ANTI-BACTERIAL COMPOUNDS DIRECTED AGAINST PILUS 
BIOGENESIS, ADHESION AND ACTIVITY; CO-CRYSTALS OF 
PILUS SUBUNITS AND METHODS OF USE THEREOF 

This invention was made in part with Government support under National Institutes of 
Health Grants R01DK51406, R01AI29549 and R01GM54033. The Government has certain 
rights in the invention. 

This application claims priority to co-pending United States provisional patent 
application Ser. No. 60/148,280, filed August 11, 1999, incorporated herein by reference. 

Field of the Invention 

The present invention relates to compounds and methods for the treatment of diseases 
caused by tissue-adhering pilus-forming bacteria. More specifically, the invention relates to 
pharmaceutical preparations comprising substances capable of interfering with the binding of 
periplasmic chaperones to pilus subunits as well as pharmaceutical compounds capable of 
interfering with the binding between pilus subunits. 

The present invention further relates to crystalline forms of pilus-subunit co- 
complexes, the high-resolution X-ray diffraction structures and atomic structure coordinates 
obtained therefrom. The pilus subunit co-crystals of the invention and the atomic structural 
information obtained therefrom are useful for solving structures of related proteins, and for 
screening for, identifying and/or designing compounds that bind periplasmic chaperones or 
pilus subunits and thus prevent the assembly and/or biological function of pili. 



Background of the Invention 

Many pathogenic Gram-negative bacteria such as Escherichia coli, Haemophilus 
influenzae, Salmonella enteritidis, Salmonella typhimurium, Bordetella pertussis, Yersinia 
enterocolitica, Yersinia pestis, Helicobacter pylori and Klebsiella pneumoniae assemble 
hair-like adhesive organelles called pili on their surfaces. Pili are thought to mediate 
microbial attachment, often the essential first step in the development of disease, by binding 
to receptors present in host tissues, and may also participate in bacterial-bacterial interactions 
important in biofilm formation. 

Uropathogenic strains of E. coli express P and type 1 pili that bind to receptors present 
in uroepithelial cells. Adhesive P pili are virulence determinants associated with 
pyelonephritic strains of E. coli, whereas type 1 pili appear to be more common in E. coli causing 
cystitis. The adhesin present at the tip of the P pilus, PapG, binds to the Galα(1-4)Gal moiety 
present in glycolipids and glycoproteins, while the type 1 adhesin, FimH, binds D-mannose 
present in glycolipids and glycoproteins. 

Type 1 pili are adhesive fibers expressed in E. coli as well as in most of the 
Enterobacteriaceae family. The type 1 pilus is a right-handed helix with about 3 subunits per 
turn, a diameter of approximately 70 Å, a central pore of about 20-25 Å, and a rise per 
subunit of about 8 Å. See G.E. Soto et al., EMBO J., 17: 6155 (1998). Type 1 pili are 
composite structures in which a short tip fibrillar structure containing FimG and the FimH 
adhesin (and possibly the minor component FimF as well) is joined to a rod composed 
predominantly of FimA subunits. See Jones et al., Proc. Natl. Acad. Sci. U.S.A., 92: 2081 
(1995). The FimH adhesin mediates binding to mannose-oligosaccharides. See S.N. 
Abraham et al., Nature, 336: 682 (1988); K.A. Krogfelt et al., Infect. Immun., 58: 1995 
(1990). In uropathogenic E. coli, this binding event has been shown to play a critical role in 
bladder colonization and disease. 
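
The helical parameters quoted above are related by simple arithmetic. The following is a minimal sketch, assuming an idealized helix with exactly the approximate values given in the text; the function names are illustrative and do not come from the specification.

```python
# Illustrative helix arithmetic using the approximate figures quoted in the text.
SUBUNITS_PER_TURN = 3.0    # "about 3 subunits per turn"
RISE_PER_SUBUNIT_A = 8.0   # rise per subunit of "about 8" angstroms


def helical_pitch(subunits_per_turn: float, rise_per_subunit: float) -> float:
    """Axial distance covered by one full turn of the helix."""
    return subunits_per_turn * rise_per_subunit


def rotation_per_subunit(subunits_per_turn: float) -> float:
    """Angular step between consecutive subunits, in degrees."""
    return 360.0 / subunits_per_turn


pitch = helical_pitch(SUBUNITS_PER_TURN, RISE_PER_SUBUNIT_A)
twist = rotation_per_subunit(SUBUNITS_PER_TURN)
print(f"pitch of about {pitch:.0f} angstroms per turn")
print(f"rotation of about {twist:.0f} degrees per subunit")
```

Because the source gives only approximate values, the derived pitch and rotation should be read as rough estimates rather than measured quantities.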

Type 1 pilus biogenesis proceeds by way of a highly conserved chaperone/usher 
pathway that is involved in the assembly of over 25 adhesive organelles in Gram-negative 
bacteria. See G.E. Soto and S. Hultgren, J. Bacteriol., 181: 1059 (1999). The usher forms an 
oligomeric channel in the outer membrane with a pore size of approximately 2.5 nm and 
mediates subunit translocation across the outer membrane. See D.G. Thanassi et al., Proc. 
Natl. Acad. Sci. U.S.A., 95: 3146 (1998). 

The P pilus is a heteropolymeric surface fiber with an adhesive tip and consists of two major 
sub-assemblies, the pilus rod and the tip fibrillum. The pilus rod is a thick rigid rod made up 
of repeating PapA subunits arranged in a right-handed helical cylinder, whereas the tip 
fibrillum is a thin, flexible tip fiber extending from the distal end of the pilus rod and is 



3 WSHU 2005.1 

PATENT 

composed primarily of repeating PapE subunits arranged in an open helical configuration. 
Two components of the tip fibrillum, PapK and PapF, act as adaptors. PapK is thought to 
link the pilus rod to the base of the tip fibrillum and to regulate the length of the tip fibrillum: 
its incorporation terminates growth of the tip fibrillum and nucleates the formation of the pilus rod. PapF is 
thought to join the PapG adhesin to the distal end of the flexible tip fibrillum. 

The biogenesis of P pili also occurs via the highly conserved chaperone/usher 
pathway. See D.G. Thanassi et al., Curr. Opin. Microbiol., 1: 223 (1998); D.L. Hung et al., 
EMBO J., 15: 3792 (1996). P pili are adhesive organelles encoded by eleven genes in the pap 
(pilus associated with pyelonephritis) gene cluster found on the chromosome of 
uropathogenic strains of E. coli. Six genes encode structural pilus subunits: PapA, PapH, 
PapK, PapE, PapF and PapG. See S.J. Hultgren et al., Cell, 73: 887 (1993). 

In P pili, two of the genes in the pap operon, papD and papC, encode the chaperone 
and usher, respectively. Chaperones such as PapD in E. coli are required to bind to pilus 
proteins imported into the periplasmic space, partition them into assembly-competent 
complexes and prevent non-productive aggregation of the subunits in the periplasm. See 
M.J. Kuehn et al., Proc. Natl. Acad. Sci. U.S.A., 88: 10586 (1991). PapD is a periplasmic 
chaperone that mediates the assembly of P pili. Detailed structural analysis has revealed that 
the PapD chaperone is the prototype member of a conserved family of periplasmic 
chaperones in Gram-negative bacteria. Periplasmic chaperones consist of two 
immunoglobulin-like domains with a deep cleft between the two domains. See A. Holmgren 
and C.I. Branden, Nature, 342: 248 (1989); M. Pellecchia et al., Nature Struct. Biol., 5: 885 
(1998). Further, all members of the periplasmic chaperone superfamily have a conserved 
hydrophobic core that maintains the overall features of the two domains. 

Periplasmic chaperones, along with outer membrane ushers, constitute a molecular 
mechanism necessary for guiding biogenesis of adhesive organelles in Gram-negative 
bacteria. These chaperones function to cap and partition interactive subunits imported into 
the periplasmic space into assembly-competent co-complexes, making non-productive 
interactions unfavorable. The chaperone-subunit co-complexes are targeted to the outer 
membrane usher, where the subunits assemble in a specific order to form a pilus. 
During pilus biogenesis, PapD binds to and caps interactive surfaces on pilus subunits and 
prevents their premature aggregation in the periplasm. PapD binds to each of the pilus 
subunit types as they emerge from the cytoplasmic membrane and escorts them in 
assembly-competent, native-like conformations from the cytoplasmic membrane to outer membrane 
assembly sites comprised of PapC. PapC has been termed a molecular usher since it receives 
chaperone-subunit co-complexes and incorporates, or ushers, the subunits from the chaperone 
co-complex into the growing pilus in a defined order. 

In the absence of an interaction with the chaperone, pilus subunits aggregate and are 
proteolytically degraded. Kolmer et al. and Jones et al. have shown that the DegP protease 
degrades pilus subunits in the absence of the chaperone. See J. Bacteriol., 178: 5925 (1996); 
EMBO J., 16: 6394 (1997). This discovery led to the elucidation of the fate of pilus subunits 
expressed in the presence or absence of the chaperone using monospecific antisera in Western 
blots of cytoplasmic membrane, outer membrane and periplasmic proteins prepared according to 
methods known in the art. 

Thus, prevention or inhibition of normal pilus assembly in a Gram-negative bacterium 
impacts the pathogenicity of the bacterium by preventing the bacterium from attaching to and 
infecting host tissues. Moreover, changes in the binding between pilus subunits and 
chaperones can have a dramatic impact on the efficiency of pilus assembly, and thus on the 
ability of a Gram-negative bacterium to adhere to and, consequently, infect host tissues. 
Prevention and inhibition of binding between pilus subunits and between pilus subunits and 
periplasmic chaperones have the effect of impairing pilus assembly, whereby the infectivity 
of the Gram-negative bacterium expressing the pili is reduced. Accordingly, a need exists, in 
general, for compositions and methods for preventing or inhibiting the normal interaction 
between pilus subunits and/or between a pilus subunit and a chaperone. 

However, identification of such compositions has heretofore relied on serendipity 
and/or systematic screening of large numbers of natural and synthetic compounds. A far 
superior method of drug screening relies on structure-based drug design. The 
three-dimensional structures of proteins or protein fragments are determined, and potential agonists 
and/or potential antagonists are designed with the aid of computer modeling. However, 
heretofore the three-dimensional structure illustrating the interaction between pilus subunits 
and/or between a pilus subunit and a chaperone has remained unknown, essentially because 
no such protein co-crystals had been produced which would permit the required X-ray 
crystallographic data to be obtained. 

Therefore, there is presently a need for obtaining a co-crystal of a co-complex of a 
pilus subunit and a chaperone to allow such crystallographic data to be obtained. Furthermore, there 
is a need for the determination of the three-dimensional structure of such co-crystals. Finally, 
there is a need for procedures for related structure-based drug design based on such 
crystallographic data. 

Summary of the Invention 

Accordingly, the present invention provides antibacterial compositions and 
compounds capable of inhibiting or preventing pilus assembly in a Gram-negative bacterium. 
Such compounds interfere with the function of chaperones required for the assembly of pili 
from pilus subunits in diverse Gram-negative bacteria. Another object of the invention is to 
provide compounds having antibacterial activity that prevent or inhibit pilus assembly by 
interfering with the interactions between pilus subunits. Yet another object of the invention is 
to provide compounds capable of inhibiting or preventing the adhesion of pili to host 
epithelium, thereby reducing the capacity of bacteria to attach to and infect host tissues. It is a 
further object of the invention to provide antibacterial compounds which have broad 
specificity for a diverse group of Gram-negative bacteria. Other objects include the provision 
of methods of preventing and inhibiting pilus assembly, methods of preventing or inhibiting 
pilus adhesion to host tissues, methods of treating bacterial infections, methods for preventing 
and inhibiting biofilm formation and methods of preventing colonization by various 
Gram-negative bacteria. 

Another aspect of the invention is to provide crystalline forms of polypeptides 
corresponding to a pilus chaperone-subunit protein co-complex. Thus, further objects of the 
present invention include the provision of the atomic structure coordinates obtained from the 
pilus chaperone-subunit co-crystals and methods of utilizing the three-dimensional structural 
information obtained from the co-crystals to design or identify compounds with antibacterial 
activity. Another related object is to provide machine- or computer-readable media 
embedded with the three-dimensional structural information obtained from the pilus 
chaperone-subunit co-complex, or portions or subsets thereof, which can be used to identify or 
design antibacterial compounds. A further object is to provide methods of making the 
co-crystals of the invention. 

Therefore, in one aspect, the present invention is directed to isolated and purified 
compounds and synthesized compounds which bind to a pilus subunit groove and thus inhibit 
pilus assembly. Preferably, such compounds mimic the binding activity of the G1 beta-strand 
of a periplasmic chaperone and comprise a polypeptide having an amino acid sequence 
containing at least two alternating hydrophobic amino acid residues. In a preferred 
embodiment, this polypeptide would be derived from a G1 beta-strand of a periplasmic 
chaperone; more preferably, this polypeptide would be comprised of amino acids derived 
from the N101 to L107 amino acid region of a G1 beta-strand of a periplasmic chaperone. A 
particularly preferred antibacterial compound comprises a peptide comprising an 
amino-terminal amino acid sequence Asn-Val-Leu-Gln-Ile-Ala-Leu (SEQ ID NO: 1) or any 
related analogues that would competitively bind to the binding site of a pilus subunit. 

In another embodiment, such compounds mimic the binding activity of the 
amino-terminal end of a pilus subunit and comprise a polypeptide having an amino acid sequence 
containing at least two alternating hydrophobic amino acid residues. Such antibacterial 
compounds will competitively bind to a binding site on pilus subunits, thereby inhibiting or 
preventing pilus assembly. A preferred polypeptide would be derived from the sequences of 
conserved amino-terminal motifs of pilus subunits. A particularly preferred antibacterial 
compound comprises a peptide comprising an amino-terminal amino acid sequence 
Ser-Asp-Val-Ala-Phe-Arg-Gly-Asn-Leu-Leu (SEQ ID NO: 12) or any related analogues that would 
competitively bind to the binding site of a pilus subunit. 
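
The "alternating hydrophobic residues" criterion can be checked mechanically for the two peptides named above. The sketch below is illustrative only: it assumes that "alternating" means hydrophobic residues at every other position, and it assumes a particular hydrophobic set (A, V, L, I, M, F, W, Y) that may differ from the classification used in this specification. The function name is hypothetical.

```python
# Assumed hydrophobic set (one-letter codes); the specification's own
# classification may differ, e.g. in how it treats Gly, Pro or Tyr.
HYDROPHOBIC = set("AVLIMFWY")


def longest_alternating_hydrophobic_run(seq: str) -> int:
    """Longest run of hydrophobic residues found at positions i, i+2, i+4, ..."""
    best = 0
    for start in (0, 1):                 # even-indexed, then odd-indexed residues
        run = 0
        for residue in seq[start::2]:
            run = run + 1 if residue in HYDROPHOBIC else 0
            best = max(best, run)
    return best


peptides = {
    "SEQ ID NO: 1": "NVLQIAL",       # Asn-Val-Leu-Gln-Ile-Ala-Leu
    "SEQ ID NO: 12": "SDVAFRGNLL",   # Ser-Asp-Val-Ala-Phe-Arg-Gly-Asn-Leu-Leu
}
for name, seq in peptides.items():
    print(name, seq, longest_alternating_hydrophobic_run(seq))
```

Under these assumptions both peptides meet the "at least two" threshold: Leu-Ile-Leu occupy alternating positions in SEQ ID NO: 1, and Val-Phe do so in SEQ ID NO: 12.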

A further object of the invention is to provide compounds which mimic mannose by 
binding to the amino-terminal end of the FimH adhesin. Such antibacterial compounds will 
bind to the mannose-binding site on pilus adhesins, thereby inhibiting or preventing the 
function of the pili to attach to and infect host tissues. 

Interference with pili assembly and prevention of the capacity of pili to attach to host 
tissues are particularly effective since both the formation of pili and attachment of pili to host 
tissues are essential to bacterial pathogenicity. As such, the invention further provides 
compositions containing the above compounds in conjunction with a pharmaceutically- 
acceptable carrier, excipient or diluent. Also provided are methods of preventing or 
inhibiting pilus assembly in a Gram-negative bacterium by administering an effective amount 
of a compound capable of interfering with the binding of pilus subunits and all pilus subunit 
homologues. The invention is also directed to methods of preventing or inhibiting the 
pathogenicity of a Gram-negative bacterium comprising administering an effective amount of 
a compound capable of interfering with the adhesion of pili to host tissues. Further provided 
are methods for treating Gram-negative infections which comprise providing to a subject an 
effective amount of the above compounds and compositions. 

Further, the present invention is directed to methods for preventing or inhibiting 
biofilm formation on a surface or in an environment containing Gram-negative bacteria. Also 
provided are methods for inhibiting bacterial colonization by a Gram-negative organism. 
These methods are accomplished by administering to such surfaces and environments an 
effective amount of a compound or a composition which is capable of interfering with pilus 
assembly or the ability of the pilus to adhere to and subsequently infect host tissues. 

In another aspect, the invention provides compositions comprising crystalline forms 
of polypeptides corresponding to the PapD-PapK chaperone-pilus subunit protein co- 
complex. The PapD-PapK co-crystals comprise crystallized polypeptides corresponding to 
the wild-type or mutated PapD-PapK co-complexes. The PapD-PapK co-crystals preferably 
include native co-crystals, heavy-atom derivative co-crystals and co-crystals of a PapD-
PapK co-complex that is further associated with one or more other molecules or compounds. 
Preferably, such other compounds bind to a site involved in protein-protein interactions in the 
pilus. 

The PapD-PapK co-crystals are generally characterized by a space group of P2₁2₁2₁ 
and a unit cell of a = 62.1 ± 0.2 Å, b = 63.6 ± 0.2 Å, c = 92.7 ± 0.2 Å, and are preferably of 
diffraction quality. In a preferred embodiment, the PapD-PapK co-crystals are of sufficient 
quality to permit the determination of the three-dimensional X-ray diffraction structure of the 
crystalline polypeptide co-complex to high resolution, preferably to a resolution of greater 
than about 3 Å, typically in the range of about 1 Å to about 3 Å. 
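
As a rough illustration of what the quoted cell dimensions imply, the unit-cell volume can be estimated as follows. This sketch assumes an orthorhombic cell (all angles 90 degrees), so that the volume is simply a × b × c, and it assumes the stated ± 0.2 tolerances can be treated as independent uncertainties for first-order error propagation. Neither assumption is stated in the text, and the function names are illustrative.

```python
import math


def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Volume of a cell with three mutually perpendicular axes (cubic angstroms)."""
    return a * b * c


def volume_uncertainty(a, b, c, da, db, dc) -> float:
    """First-order propagated uncertainty, assuming independent axis errors."""
    volume = orthorhombic_cell_volume(a, b, c)
    relative = math.sqrt((da / a) ** 2 + (db / b) ** 2 + (dc / c) ** 2)
    return volume * relative


# Cell dimensions quoted for the PapD-PapK co-crystals, in angstroms.
a, b, c = 62.1, 63.6, 92.7
tolerance = 0.2

volume = orthorhombic_cell_volume(a, b, c)
spread = volume_uncertainty(a, b, c, tolerance, tolerance, tolerance)
print(f"cell volume of roughly {volume:.0f} cubic angstroms (+/- {spread:.0f})")
```

The result is on the order of 3.7 × 10^5 cubic angstroms; it is an estimate derived from the quoted dimensions, not a value given in the specification.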

The invention also provides methods of making the co-crystals of the invention. 
Generally, co-crystals of the invention are grown by dissolving substantially pure 
polypeptides in an aqueous buffer that includes a precipitant at a concentration just below that 
necessary to precipitate the polypeptide. Water is then removed by controlled evaporation to 
produce precipitating conditions, which are maintained until co-crystal growth ceases. 

In another aspect, the invention provides machine- or computer-readable media 
embedded with the three-dimensional structural information obtained from the PapD-PapK 
co-crystals of the invention, or obtained from FimC-FimH co-crystals, or portions or subsets 
thereof. Such three-dimensional structural information will typically include the atomic 
structure coordinates of the crystallized polypeptide co-complex, or the atomic structure 
coordinates of a portion thereof, such as, for example, the atomic structure coordinates of one 
member of the co-complex or an active or binding site of one or both members, but may 
include other structural information, such as vector representations of the atomic structure 
coordinates, etc. 

Thus, the atomic structure coordinates and machine-readable media of the invention 
have a variety of uses. As such, provided are methods of identifying antibacterial compounds 
which utilize the coordinates for solving the three-dimensional X-ray diffraction and/or 
solution structures of other proteins, including mutant co-complexes, co-complexes further 
associated with other molecules, and unrelated proteins, to high resolution. Structural 
information may also be used in a variety of molecular modeling and computer-based 
screening applications to, for example, intelligently design mutants of the crystallized PapD-
PapK or FimC-FimH co-complexes having altered biological activity and to computationally 
design and identify compounds that bind the polypeptide co-complexes or a portion or 
fragment of the polypeptide co-complexes, such as the mannose binding site of FimH and/or 
the G1 beta-strand binding cleft of PapK. 

In another aspect, the present invention provides methods of using the coordinates of 
the PapD-PapK co-complex or of the FimC-FimH co-complex, or subsets of such structure 
coordinates, to design or identify candidate compounds capable of binding to a binding site 
on one member of the co-complex, or of a member of a related co-complex. Such candidate 
compounds may be evaluated for biological activity, such as, for example, the ability to bind 
(preferably competitively) the subunit of interest, the ability to disrupt chaperone-pilus 
subunit assembly and/or the ability to prevent adherence of a Gram-negative bacterium to a 
host tissue. In one embodiment, the co-crystals from which the PapD-PapK co-complex 
structure is derived have the space group and cell dimensions described above, such that the 
three-dimensional structure of the co-complex is provided to a resolution of from about 3.0 Å 
to about 2.4 Å or greater. In another embodiment, the co-crystals from which the FimC-
FimH co-complex structure is derived have the space group P4A2 or P4 3 with unit cell 
dimensions of a = b = 97.7 ± 0.2 Å and c = 215.9 ± 0.2 Å, such that the three-dimensional 
structure of the co-complex can be determined to a resolution of from about 3.0 Å to about 
2.5 Å or greater. 

In a further aspect of the invention, such potential compounds are evaluated for 
biological activity. Candidate antibacterial compounds are designed or identified using the 
atomic structure coordinates of the PapD-PapK or FimC-FimH co-complexes or subsets 
thereof, synthesized and screened for their ability to bind to pilus subunits, thereby inhibiting 
or preventing pilus biogenesis. The antibacterial activity of the compound is determined by 
assaying the bacterium for infectivity or monitoring the pilus for activity. Alternatively, 
compounds designed or identified based upon their ability to bind the mannose binding 
domain of FimH are synthesized and screened for their ability to bind FimH. Such 
compounds that are able to prevent or inhibit pilus biogenesis or the ability of the bacterial 
pilus to attach to a host tissue can be used in the compositions of the present invention. 



Other objects and features will be in part apparent and in part pointed out hereinafter. 
Brief Description of Figures 

Fig. 1A is a depiction of representative regions of the electron density of a PapD G1 
beta-strand. Electron density is from a simulated annealing omit map calculated using the 
phases derived from the final model where the PapD G1 beta-strand residues 101 to 108 have 
been omitted. Strands are labeled. 

Fig. 1B is a depiction of representative regions of electron density showing the PapD G1 
beta-strand zippering to the PapK F strand. The density is from a map calculated using 
unbiased experimental MAD solvent-flattened phases. 

Fig. 1C is a view from the hydrophobic core of PapK looking out toward the PapD G1 
beta-strand that inserts into the groove of the subunit. Residues throughout are labeled. The 
density is from a map calculated using unbiased experimental MAD solvent-flattened phases. 

Fig. 2A is a schematic of a stereo ribbon diagram. Subscripts 1 and 2 refer to 
domains 1 and 2 of PapD, respectively. 

Fig. 2B is a stereo ribbon diagram showing the molecular surface of PapK, calculated and 
displayed using GRASP. The structure of PapD is shown as a ribbon. The insertion of the G1 
beta-strand of PapD into a deep groove on the surface of PapK can be seen. 

Fig. 3A is the topology of PapK. Beta-strands are indicated as arrows, while helices 
(either α or 3₁₀) are shown as cylinders. 

Fig. 3B is a depiction of the sequence alignment of P-pilus subunits (PapA, PapK, 
PapE, and PapF). The secondary structural elements of PapK are indicated above the aligned 
sequences. Residue numbers of PapK are indicated above the PapK sequence. The 
remarkable conservation of structurally and functionally important residues strongly indicates 
that all pilins have structures similar to PapK. 

Fig. 3C is a depiction of the secondary structure definition of PapD. Residue numbers 
are indicated above the sequence, while secondary structural elements are indicated below it. 

Fig. 4 depicts the superposition of the structures of apo-PapD and PapD complexed to 
PapK. The arrow indicates the conformational change in the F1-G1 loop upon subunit 
binding. 

Fig. 5 is the definition of the binding sites in PapD and PapK. On the left, PapD is 
shown as a space-filling model and PapK as a ribbon. On the right, PapK is shown as a 
space-filling model and PapD as a ribbon. The various binding sites as defined in the text are 
labeled. 

Fig. 6A is a schematic of a stereo contact diagram of interactions between PapD and 
the NH2-terminus of PapK. Residues making contacts are shown in stick representation (thin 
for PapD, and thick for PapK). 

Fig. 6B is a schematic of a stereo contact diagram of interactions between PapD and 
the COOH-terminal F strand of PapK. The NH2-terminal strand A and the COOH-terminal 
strand F form the sides of the groove in PapK. Residues making contacts are shown in stick 
representation (thin for PapD, and thick for PapK). 

Fig. 6C is a schematic of a stereo contact diagram of interactions between PapK and 
domain 2 of PapD. Residues making contacts are shown in stick representation (thin for 
PapD, and thick for PapK). 

Fig. 6D is a schematic of a stereo contact diagram of interactions between the C- 
terminal carboxylate of PapK with PapD. Residues making contacts are shown in stick 
representation (thin for PapD, and thick for PapK). 

Fig. 6E is a depiction of the G1 beta-strand of PapD as it inserts into the groove of 
PapK. The PapD G1 strand is represented as a stick model with color coding as in Fig. 6A 
and PapK is shown as a molecular surface calculated using GRASP. Notice the 
predominance of hydrophobic residues in the groove, the base of which is part of the 
hydrophobic core of the protein. 

Fig. 7A is a schematic diagram of subunit-subunit interactions in the pilus rod model as 
viewed from above. Insertion of the NH2-terminal strand of one subunit into the groove made 
by the A2 and F strands of the preceding subunit, such that the NH2-terminal strand is parallel 
to strand F, results in a three-pointed-star-shaped cross-section inconsistent with electron 
microscopy data. Strands (arrows) are labeled, as are the NH2- and COOH-termini (N and C 
respectively). Hydrogen bonding interactions are shown schematically. 

Fig. 7B is a schematic diagram of subunit-subunit interactions in the pilus rod model as 
viewed from above. Insertion of the NH2-terminal strand antiparallel to strand F yields a 
cross-section compatible with electron microscopy data. Strands (arrows) are labeled, as are 
the NH2- and COOH-termini (N and C respectively). Hydrogen bonding interactions are 
shown schematically. 

Fig. 7C is a molecular surface of a pilus rod (program GRASP). The disordered 
residues at the NH2-terminus of the subunit were modeled as a strand that inserts into the 
groove of the preceding subunit. Approximately three turns of the model pilus, whose 
dimensions are similar to the known values from electron microscopy, are shown. 

Fig. 7D is a stereo ribbon diagram of the rod model. The insertion of the NH2-terminal 
strand of one subunit into the groove of the preceding subunit can be clearly seen. 

Fig. 8A depicts the amino acid sequences of type 1 pilus subunits (FimA, FimF, FimG, 
FimH). The end of the mannose binding lectin domain and the start of the pilin domain in 
FimH are indicated by vertical arrows above the sequences. Type 1 pilin subunits (FimA, 
FimF, FimG) were aligned with the pilin domain of FimH using Clustal W and manually 
adjusted to minimize gaps in secondary structure elements. Gaps in the alignment are 
indicated by dots. Sequence numbering for FimH starts at position 22 in the pre-protein. 
Residues involved in chaperone binding are indicated by an open circle above the residue. 
Residues in the carbohydrate binding pocket are boxed. A large box marks the NH2-terminal 
extensions in the pilin subunits. The conserved beta-zipper motif found in all pilin subunits 
corresponds to the F beta-strand. Limits and nomenclature for secondary structure elements 
are shown below the sequence. 

Fig. 8B shows beta-sheet topology diagrams of the mannose binding domain (left) and 
pilin domain (right) of FimH. 

Fig. 9A is a typical sample of the solvent-flattened experimental electron density map 
(contoured at 1.0σ) with the refined model superimposed. Arg8C and Lys112C anchor the 
COOH-terminus of FimH in the subunit binding cleft of the chaperone via hydrogen bonds to 
the terminal carboxylate. 

Fig. 9B is a MOLSCRIPT ribbon diagram of the FimC-FimH co-complex. A 
ball-and-stick representation of the C-HEGA molecule bound to the lectin domain of FimH 
indicates the position of the carbohydrate-binding site at the tip of the domain. 

Fig. 10A is a depiction of FimH carbohydrate binding: a stereo view of the 
carbohydrate binding pocket with a molecule of C-HEGA bound. Residues Phe1H, Ile13H, 
Asn46H, Asp47H, Tyr48H, Ile52H, Asp54H, Gln133H, Asn135H, Tyr137H, Asn138H, Asp140H and Phe142H, 
which line the surface of the pocket at the tip of the lectin domain, are shown. Residues that take part in 
hydrogen bonding to the glucamide moiety of C-HEGA are labeled. 

Fig. 10B is a depiction of the surface of the FimH pilin domain showing the exposed 
hydrophobic core. Hydrophobic residues that are in contact with FimC in the co-complex but 
solvent exposed upon removal of the chaperone are highlighted in yellow. Right: as left but 
with FimC ribbon in blue. The seventh strand of FimC, G1, donates hydrophobic residues to 
complement the incomplete hydrophobic core of the pilin domain. 

Fig. 10C is a close-up of donor strand complementation interactions. Hydrophobic 
residues on the surface of the pilin domain (Val163H, Ala165H, Thr169H, Ile181H, Leu183H, Val223H, 
Leu225H, Ile272H, Val274H, and Phe276H) and FimC residues involved in donor strand 
complementation (Leu103C, Leu105C, Ile107C, Ser109C, Ile111C) pack against each other to form a 
complete hydrophobic core extending between the two proteins. 

Fig. 11A is a model of the type 1 pilus. 

Fig. 11B is a top view of the type 1 pilus. Residue positions that are subject to allelic 
variation map to the outer surface of the pilus. 

Fig. 11C is a side view of the type 1 pilus. 

Fig. 12 is a graphic representing the binding of FimH to polypeptides corresponding 
to the G1 beta-strand of FimC and the N-terminal extension of FimC. The two polypeptides 
or FimC were coated onto microtiter wells and FimH binding to the immobilized 
polypeptides or FimC protein was determined by ELISA using anti-FimH antibodies. The 
graph represents the average of triplicate wells with the standard deviation shown in bars. 

Fig. 13 is a graph which represents the binding of FimH in the presence of increasing 
concentrations of the FimC polypeptide. It can be seen that FimC polypeptides inhibit FimH 
binding to FimC. The graphs represent the average of triplicate wells with the standard 
deviation shown in bars. 

Fig. 14 is a graph which represents the FimH binding to FimC in the presence or 
absence of FimG or FimC polypeptides as monitored by ELISA. The graphs represent the 
average of triplicate wells with the standard deviation shown in bars. 

Abbreviations and Definitions 

To facilitate understanding of the invention, a number of terms are defined below: 
The amino acid notations used herein for the twenty genetically encoded L-amino 
acids are conventional and are abbreviated as follows: 



Amino Acid       One-Letter Symbol    Three-Letter Symbol

Alanine          A                    Ala
Arginine         R                    Arg
Asparagine       N                    Asn
Aspartic acid    D                    Asp
Cysteine         C                    Cys
Glutamine        Q                    Gln
Glutamic acid    E                    Glu
Glycine          G                    Gly
Histidine        H                    His
Isoleucine       I                    Ile
Leucine          L                    Leu
Lysine           K                    Lys
Methionine       M                    Met
Phenylalanine    F                    Phe
Proline          P                    Pro
Serine           S                    Ser
Threonine        T                    Thr
Tryptophan       W                    Trp
Tyrosine         Y                    Tyr
Valine           V                    Val



As used herein, unless specifically delineated otherwise, the three-letter and one-letter 
amino acid abbreviations designate amino acids in either the D-configuration or the L- 
configuration. For example, Arg designates D-arginine and L-arginine, and R designates D- 
arginine and L-arginine. 

Unless noted otherwise, when polypeptide sequences are presented as a series of one- 
letter and/or three-letter abbreviations, the sequences are presented in the N → C direction, in 
accordance with common practice. As used herein, "Cα" refers to the alpha carbon of an 
amino acid residue. 
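
As an illustration of the notation above, the following sketch converts a sequence written with three-letter abbreviations, read in the N → C direction, into one-letter symbols. The function name and the hyphen-separated input format are assumptions made for this example only.

```python
# Three-letter to one-letter symbols for the twenty genetically encoded amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq):
    """Convert e.g. 'Leu-Val-Ile' (N -> C order) to 'LVI'."""
    residues = three_letter_seq.replace(" ", "").split("-")
    # capitalize() normalizes input such as 'LEU' or 'leu' to 'Leu'.
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues)


print(to_one_letter("Leu-Val-Ile"))
```

Because the abbreviations do not encode configuration, the converted string carries no information on whether a residue is the D- or the L-form.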

For purposes of determining conservative amino acid substitutions in the various 
polypeptides described herein and for describing the various peptide and peptide analog 
compounds, the amino acids can be conveniently classified into two main categories, 
hydrophilic and hydrophobic, depending primarily on the physical-chemical characteristics 
of the amino acid side chain. These two main categories can be further classified into 
subcategories that more distinctly define the characteristics of the amino acid side chains. 
For example, the class of hydrophilic amino acids can be further subdivided into acidic, basic 
and polar amino acids. The class of hydrophobic amino acids can be further subdivided into 
apolar and aromatic amino acids. The definitions of the various categories of amino acids are 
as follows: 

"Hydrophilic amino acid" refers to an amino acid exhibiting a hydrophobicity of less 
than zero according to the normalized consensus hydrophobicity scale of Eisenberg et al., 
1984, J. Mol. Biol. 179:125-142. Genetically encoded hydrophilic amino acids include Thr 
(T), Ser (S), His (H), Glu (E), Asn (N), Gln (Q), Asp (D), Lys (K) and Arg (R). 

"Acidic amino acid" refers to a hydrophilic amino acid having a side chain pK value 
of less than 7. Acidic amino acids typically have negatively charged side chains at 
physiological pH due to loss of a hydrogen ion. Genetically encoded acidic amino acids 
include Glu (E) and Asp (D). 

"Basic amino acid" refers to a hydrophilic amino acid having a side chain pK value of 
greater than 7. Basic amino acids typically have positively charged side chains at 
physiological pH due to association with hydronium ion. Genetically encoded basic amino 
acids include His (H), Arg (R) and Lys (K). 

"Polar amino acid" refers to a hydrophilic amino acid having a side chain that is 
uncharged at physiological pH, but which has at least one bond in which the pair of electrons 
shared in common by two atoms is held more closely by one of the atoms. Genetically 
encoded polar amino acids include Asn (N), Gln (Q), Ser (S) and Thr (T). 

"Hydrophobic amino acid" refers to an amino acid exhibiting a hydrophobicity of 
greater than zero according to the normalized consensus hydrophobicity scale of Eisenberg, 
1984, J. Mol. Biol. 179:125-142. Genetically encoded hydrophobic amino acids include Pro 
(P), Ile (I), Phe (F), Val (V), Leu (L), Trp (W), Met (M), Ala (A), Gly (G) and Tyr (Y). 

"Aromatic amino acid" refers to a hydrophobic amino acid with a side chain having at 
least one aromatic or heteroaromatic ring. The aromatic or heteroaromatic ring may contain 
one or more substituents such as -OH, -SH, -CN, -F, -Cl, -Br, -I, -NO2, -NO, -NH2, -NHR, 
-NRR, -C(O)R, -C(O)OH, -C(O)OR, -C(O)NH2, -C(O)NHR, -C(O)NRR and the like where 
each R is independently (C1-C6) alkyl, substituted (C1-C6) alkyl, (C2-C6) alkenyl, substituted 
(C2-C6) alkenyl, (C2-C6) alkynyl, substituted (C2-C6) alkynyl, (C5-C20) aryl, substituted (C5-C20) 
aryl, (C6-C26) arylalkyl, substituted (C6-C26) arylalkyl, 5-20 membered heteroaryl, substituted 
5-20 membered heteroaryl, 6-26 membered heteroarylalkyl or substituted 6-26 membered 
heteroarylalkyl. Genetically encoded aromatic amino acids include His (H), Phe (F), Tyr (Y) 
and Trp (W). 

"Apolar amino acid" refers to a hydrophobic amino acid having a side chain that is 
uncharged at physiological pH and which has bonds in which the pair of electrons shared in 
common by two atoms is generally held equally by each of the two atoms (i.e., the side chain 
is not polar). Genetically encoded apolar amino acids include Leu (L), Val (V), Ile (I), Met 
(M), Gly (G) and Ala (A). 

"Aliphatic amino acid" refers to a hydrophobic amino acid having an aliphatic 
hydrocarbon side chain. Genetically encoded aliphatic amino acids include Ala (A), Val (V), 
Leu (L) and Ile (I). 

"Hydroxyl-substituted aliphatic amino acid" refers to a hydrophilic polar amino acid 
having a hydroxyl-substituted side chain. Genetically-encoded hydroxyl-substituted aliphatic 
amino acids include Ser (S) and Thr (T). 

The amino acid residue Cys (C) is unusual in that it can form disulfide bridges with 
other Cys (C) residues or other sulfanyl-containing amino acids. The ability of Cys (C) 
residues (and other amino acids with -SH containing side chains) to exist in a peptide in either 
the reduced free -SH or oxidized disulfide-bridged form affects whether Cys (C) residues 
contribute net hydrophobic or hydrophilic character to a peptide. While Cys (C) exhibits a 
hydrophobicity of 0.29 according to the normalized consensus scale of Eisenberg (Eisenberg, 
1984, supra), it is to be understood that for purposes of the present invention Cys (C) is 
categorized as a polar hydrophilic amino acid, notwithstanding the general classifications 
defined above. 

As will be appreciated by those of skill in the art, the above-defined categories are not 
mutually exclusive. Thus, amino acids having side chains exhibiting two or more physical- 
chemical properties can be included in multiple categories. For example, amino acid side 
chains having aromatic moieties that are further substituted with polar substituents, such as 
Tyr (Y), may exhibit both aromatic hydrophobic properties and polar or hydrophilic 
properties, and can therefore be included in both the aromatic and polar categories. As 
another example, His (H) has a side chain that falls within the aromatic and basic categories. 
The appropriate categorization of any amino acid will be apparent to those of skill in the art, 
especially in light of the detailed disclosure provided herein. 
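
The non-exclusive categories defined above can be represented as overlapping sets, as in the following minimal sketch. Membership is taken from the lists of genetically encoded amino acids given in the definitions, with Cys placed in the polar class as stipulated. The names SUBCATEGORIES, categories_of and share_category are hypothetical, and treating two residues as a conservative pair whenever they share at least one subcategory is an assumption made for illustration, not a rule stated in the text.

```python
# Subcategory membership (one-letter symbols) from the definitions above.
SUBCATEGORIES = {
    "acidic": set("DE"),
    "basic": set("HRK"),
    "polar": set("NQSTC"),   # Cys is categorized as polar hydrophilic
    "aromatic": set("HFYW"),
    "apolar": set("LVIMGA"),
    "aliphatic": set("AVLI"),
}


def categories_of(residue):
    """Return every subcategory that lists the given one-letter residue."""
    return {name for name, members in SUBCATEGORIES.items() if residue in members}


def share_category(res_a, res_b):
    """True if the two residues have at least one subcategory in common."""
    return bool(categories_of(res_a) & categories_of(res_b))


print(sorted(categories_of("H")))
print(share_category("L", "I"), share_category("D", "K"))
```

In this representation His falls in both the aromatic and the basic sets, which mirrors the example given in the text.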

While the above-defined categories have been exemplified in terms of the genetically 
encoded amino acids, the amino acid substitutions need not be, and in certain embodiments 
preferably are not, restricted to the genetically encoded amino acids. Indeed, since many of 
the compounds described herein may be produced synthetically, they may comprise one or 
more genetically non-encoded amino acids. Thus, in addition to the naturally occurring 
genetically encoded amino acids, amino acid residues in the core peptides of structure (I) may 
be substituted with naturally occurring non-encoded amino acids and synthetic amino acids. 

Certain commonly encountered amino acids of which the compounds of the invention 
may be comprised include, but are not limited to, β-alanine (β-Ala) and other omega-amino 
acids such as 3-aminopropionic acid, 2,3-diaminopropionic acid (Dpr), 4-aminobutyric acid 
and so forth; α-aminoisobutyric acid (Aib); ε-aminohexanoic acid (Aha); δ-aminovaleric 
acid (Ava); N-methylglycine or sarcosine (MeGly); ornithine (Orn); citrulline (Cit); 
t-butylalanine (t-BuA); t-butylglycine (t-BuG); N-methylisoleucine (MeIle); phenylglycine 
(Phg); cyclohexylalanine (Cha); norleucine (Nle); naphthylalanine (Nal); 
4-chlorophenylalanine (Phe(4-Cl)); 2-fluorophenylalanine (Phe(2-F)); 3-fluorophenylalanine 
(Phe(3-F)); 4-fluorophenylalanine (Phe(4-F)); penicillamine (Pen); 
1,2,3,4-tetrahydroisoquinoline-3-carboxylic acid (Tic); β-2-thienylalanine (Thi); methionine 
sulfoxide (MSO); homoarginine (hArg); N-acetyl lysine (AcLys); 2,4-diaminobutyric acid 
(Dbu); 2,3-diaminobutyric acid (Dab); p-aminophenylalanine (Phe(pNH2)); N-methylvaline 
(MeVal); homocysteine (hCys), homophenylalanine (hPhe) and homoserine (hSer); 
hydroxyproline (Hyp), homoproline (hPro), N-methylated amino acids and peptoids 
(N-substituted glycines). 

The classifications of the genetically encoded and common non-encoded amino acids 
according to the categories defined above are summarized in Table 1, below. It is to be 
understood that Table 1 is for illustrative purposes only and does not purport to be an 
exhaustive list of amino acid residues that can be used in the invention. Additional amino 
acids may be found in Fasman, 1989, Practical Handbook of Biochemistry and Molecular 
Biology, CRC Press, Inc., pp. 3-70, and the references cited therein. 

TABLE 1: CLASSIFICATIONS OF COMMONLY ENCOUNTERED AMINO ACIDS 

Classification    Genetically Encoded    Non-Genetically Encoded
Hydrophobic
  Aromatic        H, F, Y, W             Phg, Nal, Thi, Tic, Phe(4-Cl), Phe(2-F),
                                         Phe(3-F), Phe(4-F), hPhe
  Apolar          L, V, I, M, G, A, P    t-BuA, t-BuG, MeIle, Nle, MeVal, Cha,
                                         MeGly, Aib
  Aliphatic       A, V, L, I             β-Ala, Dpr, Aib, Aha, MeGly, t-BuA,
                                         t-BuG, MeIle, Cha, Nle, MeVal
Hydrophilic
  Acidic          D, E
  Basic           H, K, R                Dpr, Orn, hArg, Phe(p-NH2), Dbu, Dab
  Polar           C, Q, N, S, T          Cit, AcLys, MSO, β-Ala, hSer



As utilized herein, the term "pilus" or "pili" relates to fibrillar heteropolymeric 
structures embedded in the cell envelope of many tissue-adhering pathogenic bacteria, 
notably pathogenic Gram-negative bacteria. In the present specification, the terms pilus and 
pili will be used interchangeably. A pilus is composed of a number of "pilus subunits" which 
constitute distinct functional parts of the intact pilus. 

The term "chaperone" relates to a molecule which in living cells has the responsibility 
of binding to polypeptides in order to mature the polypeptides in a number of ways. Many 
molecular chaperones are involved in the process of folding polypeptides into their native 
conformations whereas other molecular chaperones are involved in the export out of or 
import into the cell of polypeptides. Specialized molecular chaperones are "periplasmic 
chaperones" which are bacterial molecular chaperones exerting their main actions in the 
"periplasmic space." Specialized periplasmic chaperones also have an immunoglobulin-like 
three dimensional structure. The periplasmic space constitutes the space in between the inner 
and outer bacterial membrane. Periplasmic chaperones are involved in the process of correct 
assembly of intact pili structures. When used herein, the use of the term "chaperone" 
designates a molecular, periplasmic chaperone unless otherwise indicated. 

The phrase "preventing or inhibiting binding between pilus subunits and a periplasmic 
chaperone" indicates that the normal interaction between a chaperone and its natural ligand, 
i.e., the pilus subunit, is being affected either by being inhibited, expressed in another 
manner, or reduced to such an extent that the binding of the pilus subunit to the chaperone is 
measurably lower than is the case when the chaperone is interacting with the pilus subunit at 
conditions which are substantially identical (with regard to pH, concentration of ions, and 
other molecules) to the native conditions in the periplasmic space. Measurement of the 
degree of binding can be determined in vitro by methods known to the person skilled in the 
art (microcalorimetry, radioimmunoassays, enzyme based immunoassays, etc.). 

The phrase "preventing or inhibiting binding between pilus subunits" generally 
indicates that the normal interaction between pilus subunits is being affected either by being 
inhibited, expressed in another manner, or reduced to such an extent that the binding of a 
pilus subunit to another pilus subunit is measurably lower than is the case when the pilus 
subunits are interacting at conditions which are substantially identical (with regard to pH, 
concentration of ions, and other molecules) to the native conditions during pilus assembly. 
This phrase can apply to the dissociation of pre-formed pilus subunit-subunit interactions 
during pilus assembly. Measurement of the degree of binding can be determined in vitro by 
methods known to the person skilled in the art (microcalorimetry, radioimmunoassays, 
enzyme based immunoassays, etc.). 

The compounds and compositions of the present invention which prevent or inhibit 
binding between pilus subunits or between a pilus chaperone and a pilus subunit are said to 
exhibit "antibacterial activity." 

By the term "subject in need thereof" is in the present context meant a subject, which 
can be any plant or animal, including a human being, who is infected with, or is likely to be 
infected with, tissue-adhering pilus-forming bacteria which are believed to be pathogenic. 

By the term "an effective amount" is meant an amount of the substance in question 
which will in a majority of patients have either the effect that the disease caused by the 
pathogenic bacteria is cured or ameliorated or, if the substance has been given 
prophylactically, the effect that the disease is prevented from manifesting itself. The term "an 
effective amount" also implies that the substance is given in an amount which only causes 
mild or no adverse effects in the subject to whom it has been administered, or that the adverse 
effects may be tolerated from a medical and pharmaceutical point of view in the light of the 
severity of the disease for which the substance has been given. 

As used herein, "treatment" includes both prophylaxis and therapy. Thus, in treating a 
subject, the compounds of the invention may be administered to a subject already harboring a 
bacterial infection or in order to prevent such infection from occurring. 

By the term "a mimic of a pilus subunit" is meant a compound which has been 
established to bind to a chaperone or to another pilus subunit in a manner which is 
comparable to the way the pilus subunit binds to the chaperone or to the way that the pilus 
subunits bind to each other, respectively. 

The terms "an analogue of a G1 beta-strand of a periplasmic chaperone" or "a mimic 
of a G1 beta-strand of a periplasmic chaperone" denote any substance which mimics or has 
the ability to bind to at least one pilus subunit in a manner which corresponds to the binding 
of a chaperone to a pilus subunit in the periplasmic space. Such an analogue or mimic of the 
chaperone can be a modified form of the intact chaperone (e.g. one of the two domains of 
PapD) or it can be a modified form of the chaperone which may e.g. be coupled to a probe, 
marker or another moiety. Another such analogue or mimic can be obtained by modifying or 
mutating the G1 beta-strand of the periplasmic chaperone so that it differs from the wild-type 
sequence by the substitution of at least one amino acid residue of the wild-type sequence with 
a different amino acid residue and/or by the addition and/or deletion of one or more amino 
acid residues to or from the wild-type sequence. The additions and/or deletions can be from 
an internal region of the wild-type sequence and/or at either or both of the N- or C-termini. In 
the present context, the pilus subunit, mimic or analogue thereof exhibits at least one binding 
characteristic relevant for the assembly of pili. 

In the present context the terms "an analogue of a pilus subunit" and "a mimic of a 
pilus subunit" should be understood, in a broad sense, to mean any substance which mimics 
(with respect to binding characteristics) an effective part of a pilus subunit (e.g. the amino- 
terminal portion of the pilus subunit). Thus, the analogue or mimic may simply be any other 
compound regarded as capable of mimicking the binding between pilus subunits in vivo or in 
vitro. In the present context, the pilus subunit, mimic or analogue thereof exhibits at least one 
binding characteristic relevant for the assembly of pili. 

In the present context the terms "a mannose analogue" or "a mannose mimic" should 
be understood, in a broad sense, to mean any substance which mimics (with respect to binding 
characteristics) the mannose sugar which binds to an effective part of the FimH adhesin (e.g., 
the NH2-terminal mannose-binding domain). Thus, the analogue or mimic may simply be any 
other compound regarded as capable of mimicking the binding of a mannose-oligosaccharide 
to FimH adhesin in vivo or in vitro. In the present context, the mannose analogue or mannose 
mimic exhibits at least one binding characteristic relevant for the adhesion of pili. 

The term "donor strand complementation" refers to the mechanism by which a 
chaperone donates its G1 beta-strand to complete the fold of a pilus subunit. 

The term "donor strand exchange" refers to the mechanism by which the amino- 
terminal extension of a pilus subunit displaces the G1 beta-strand of a pilus chaperone and 
subsequently occupies the subunit groove previously occupied by the G1 beta-strand. 

The term "crystallized PapD-PapK chaperone-subunit co-complex" refers to a 
polypeptide co-complex having an amino acid sequence as set out in SEQ ID NO: 1 and SEQ 
ID NO: 12 and which is in crystalline form. 

The term "crystal" refers to a composition comprising a polypeptide in crystalline 
form. The term "crystal" includes native crystals, heavy-atom derivative crystals and co- 
crystals, as defined herein. 

The term "native crystal" refers to a crystal wherein the polypeptide is substantially 
pure. As used herein, native crystals do not include crystals of polypeptides comprising 
amino acids that are modified with heavy atoms, such as crystals of selenomethionine 
mutants, selenocysteine mutants, etc. 

The term "heavy-atom derivative crystal" refers to a crystal wherein the polypeptide is 
in association with one or more heavy-metal atoms. As used herein, heavy-atom derivative 
crystals include native crystals into which a heavy metal atom is soaked, as well as crystals of 
selenomethionine mutants and selenocysteine mutants. 

The term "co-complex" refers to a polypeptide in association with one or more 
additional polypeptides or other molecules. For example, the PapD-PapK and FimC-FimH 
assemblies are co-complexes. 

The term "co-crystal" refers to a composition comprising a co-complex, as defined 
above, in crystalline form. Co-crystals include native co-crystals and heavy-atom derivative 
co-crystals. 

The term "unit cell" refers to the smallest and simplest volume element (i.e., 
parallelepiped-shaped block) of a crystal that is completely representative of the unit or pattern 
of the crystal. The dimensions of the unit cell are defined by six numbers: dimensions a, b 
and c and angles α, β and γ (Blundell et al., 1976, Protein Crystallography, Academic Press). 
A crystal is an efficiently packed array of many unit cells. 
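
By way of illustration, the six unit cell parameters determine the cell volume through the standard formula for a general (triclinic) parallelepiped. The sketch below is not part of the disclosed method. The function name and the example cell dimensions are hypothetical, and the three angles are assumed to be given in degrees in the conventional order (the angles between b and c, a and c, and a and b).

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Volume of a unit cell from edge lengths and inter-edge angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    # General triclinic formula; reduces to a*b*c when all angles are 90 degrees.
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


# Hypothetical orthogonal cell, edge lengths in angstroms (illustration only).
print(unit_cell_volume(10.0, 20.0, 30.0, 90.0, 90.0, 90.0))
```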

The phrase "having substantially the same three-dimensional structure" refers to a 
polypeptide that is characterized by a set of atomic structure coordinates that have a root 
mean square deviation (r.m.s.d.) of less than or equal to about 2 Å when superimposed onto 
the atomic structure coordinates of Tables 4 or 5 when at least about 50% to 100% of the Cα 
atoms of the coordinates are included in the superposition. 
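
The r.m.s.d. criterion can be sketched as follows. This minimal example assumes that the two coordinate sets have already been superimposed and that their atoms are paired in the same order; it does not perform the superposition itself. The function names, the default cutoff argument and the example coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def substantially_same(coords_a, coords_b, cutoff=2.0):
    """Apply the cutoff (in angstroms) to pre-superimposed coordinates."""
    return rmsd(coords_a, coords_b) <= cutoff


# Two hypothetical atom pairs, each displaced by 1 angstrom along z.
print(rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)]))
```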

Detailed Description of the Invention 

In accordance with the present invention, applicants have designed and fabricated 
compounds which mimic components of chaperones such as PapD and pilus subunits such as 
PapK, and which thereby function to interfere with pilus assembly. Specifically, applicants 
have devised compounds and methods which interfere with the binding of a chaperone or a 
pilus subunit to a pilus subunit which will thus interfere with the formation of intact pili, 
thereby reducing the capacity of bacteria to adhere to host epithelium. Further, applicants 
have devised compounds which interfere with the adhesion of FimH adhesin to mannose 
oligosaccharides located on the host epithelium thereby reducing the capacity of piliated 
bacteria to attach to and infect host tissues. Applicants have further demonstrated that 
prevention or inhibition of pilus assembly in Gram-negative pathogens can be accomplished 
in a number of ways. 

The co-crystal structure of PapD has been resolved and refined to a 2.0 angstrom 
resolution, revealing a molecule with two immunoglobulin-like domains oriented in an L 
shape to form a cleft at their interface. See A. Holmgren and C.E. Brenden, Nature, 342:248 
(1989). The chaperone cleft contains surface-exposed residues that are highly conserved. 
Each immunoglobulin-like domain has a beta-barrel structure formed by two antiparallel 
beta-pleated sheets with an overall topology similar to an immunoglobulin fold. Applicants 
have resolved the co-crystal structure of the PapD-PapK chaperone-subunit co-complex 
which reveals how PapD stabilizes pilus subunits in the periplasm. Further, a combination of 
genetic, biochemical, and crystallographic data has demonstrated that the G1 beta-strand of 
PapD forms a beta-zipper interaction with the highly conserved COOH-terminal motif of 
pilus subunits. See Hung et al., EMBO J. 15:3792 (1996); Kuehn et al., Science 262:1234 
(1993); Soto et al., EMBO J. 17:6155 (1998). This COOH-terminal motif also comprises at 
least part of a primary surface for subunit-subunit assembly interactions, indicating that the 
direct capping of a primary assembly surface is part of the molecular basis by which 
periplasmic chaperones prevent the premature oligomerization of pilus subunits. In addition, 
it is believed that the beta-zipper interaction facilitates the folding of the subunit into a native- 
like conformation via a template-mediated mechanism. 

Applicants have solved the three dimensional co-crystal structure of a FimC-FimH 
chaperone-adhesin co-complex from uropathogenic E. coli. See Choudhury et al., Science 
285:1061 (1999). This molecular mechanism is supported by this structure. Specifically, 
applicants have demonstrated that in the FimC-FimH co-complex, the seventh (G1) strand 
from the NH2-terminal domain of the chaperone is used to complement the pilin domain 
between the second half of the A strand and the F strand of the domain. As such, the F strand 
of FimH forms a parallel beta-strand interaction with the G1 beta-strand of FimC and has its 
COOH-terminal carboxyl group anchored in the crevice of the chaperone cleft of FimC. 

Thus, applicants have elucidated the mechanism of binding between PapD and the 
pilus subunit PapK, thereby identifying an essential part of a defined binding site responsible 
for the binding between pilus subunits as well as binding between pilus subunits and their 
periplasmic chaperones. Furthermore, applicants have utilized the PapD-PapK co-crystal 
structure, the first of such a co-complex, and the FimC-FimH co-crystal structure to provide 
further insights into the processes of subunit folding, capping, and assembly in the 
chaperone/usher pathway of pilus biogenesis, and thereby devised compounds, compositions 
and methods for the prevention and inhibition of pilus formation. 

Furthermore, applicants have elucidated the mannose binding domain of the FimH 
adhesin which is responsible for mediating the binding of pili to mannose receptors on host 
cells. As demonstrated further in the examples, a pocket capable of accommodating a mono- 
mannose unit is located at the tip of the lectin domain of the FimH adhesin. Applicants have 
utilized the identification of this mannose-binding site to design compounds and 
compositions which would function to interfere with pilus attachment to epithelial tissues 
thereby inhibiting or preventing the ability of the bacterium to infect host tissues. 

PapD-PapK Chaperone-Subunit Co-Complex 

An important aspect of the PapD-PapK chaperone-subunit co-complex is the structure 
of the PapK subunit. PapK has an immunoglobulin-like fold; however, it lacks the canonical 
seventh beta-strand and in its place is a deep groove located on the surface of the PapK 
subunit. The base of the groove on the surface of the PapK subunit is formed by the 
hydrophobic core of the protein. From the resolved co-crystal structure of the PapD-PapK 
chaperone-subunit co-complex, it can be seen that the G1 beta-strand of the chaperone 
occupies this groove and prevents the exposure of the hydrophobic core of the subunit, which 
would lead to the destabilization and degradation of the subunits. 

Moreover, the PapD-PapK chaperone-subunit co-complex provides further insight 
into the mechanism by which pilus subunits assemble to form a mature, intact pilus. The 
eight amino acids located on the amino-terminus of PapK are disordered and presumably 
project away from the co-complex. These residues contain a pattern of alternating 
hydrophobic residues typical of a beta-strand which is conserved in pilus subunits. Thus, 
while not being bound to a particular theory, it is believed that in the mature pilus, the amino- 
terminal residues of one subunit occupy the groove of the adjacent subunit. 
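
A pattern of alternating hydrophobic residues of the kind described above can be detected with a simple scan, sketched below. The hydrophobic set is the list of genetically encoded hydrophobic amino acids given in the definitions. The function name, the default of three alternating positions and the test sequences are hypothetical and are not sequences from the invention, and the sketch places no constraint on the intervening residues, which is an assumption.

```python
# One-letter symbols of the genetically encoded hydrophobic amino acids.
HYDROPHOBIC = set("PIFVLWMAGY")


def has_alternating_hydrophobic(seq, count=3):
    """True if some positions i, i+2, i+4, ... (count of them) are all hydrophobic."""
    span = 2 * (count - 1) + 1
    for start in range(len(seq) - span + 1):
        if all(seq[start + 2 * k] in HYDROPHOBIC for k in range(count)):
            return True
    return False


# Hypothetical sequences for illustration only.
print(has_alternating_hydrophobic("SLTVKIQ"))
print(has_alternating_hydrophobic("SSTTKKQ"))
```

In a beta-strand, residues at every second position point to the same face of the strand, which is why such a scan uses a step of two.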

In the PapD-PapK co-complex structure, strand F of PapK forms one side of the 
groove into which the G1 beta-strand of the chaperone is inserted and is likely to assume the 
same structural role in pilins. Structural, biochemical and genetic data have demonstrated 
that strand F (and hence the groove) in pilins is involved in both chaperone-subunit and 
subunit-subunit interactions. By donating a secondary structural element to the fold of the 
pilin, the chaperone not only contributes to the stability of the pilin but also prevents other 
pilins in the periplasm from binding to the groove of the chaperone-bound subunit. 

The amino-terminal region of pilins, corresponding to the disordered amino-terminus 
of PapK, has also been shown to form an assembly surface on the pilin. The eight NH2- 
terminal residues are disordered in the PapD-PapK co-complex and protrude away from the 
main body of the co-crystal structure where they would be free to interact with the groove of 
the preceding subunit located at the usher. The amino-terminus of an incoming subunit 
inserts into the groove of the preceding subunit, displacing the G1 beta-strand of the 
chaperone in a mechanism that is facilitated by the usher. Applicants refer to this mechanism 
as "donor strand exchange". Donor strand exchange implies that in the pilus, the NH2- 
terminal strand of one subunit would complete the immunoglobulin-like fold and protect the 
hydrophobic core of the preceding subunit, much as the chaperone does in the periplasm. 

A donor strand exchange model for pilus assembly employing a PapK structure was 
utilized to model a PapA pilus rod. Pilus rods are well-ordered helical structures with a 
diameter of 68 Å, a pitch of 24.9 Å, and 3.28 subunits per turn. The disordered NH2-terminus 
of PapK was modeled as a beta-strand protruding from the Ig fold at an angle consistent with 
the ordered portion of the NH2-terminus in the structure, and inserted into the groove of the 
preceding subunit. A pilus rod with the appropriate general features and without steric 
clashes could be built by applying identical translational and rotational operations to 
successive subunits. The model pilus has a 72 Å diameter, a pitch of approximately 22 Å, 
and approximately 3.3 subunits per turn, similar to the actual dimensions of the pilus rod 
(Fig. 7). However, the model has an unexpected feature: the NH2-terminal strand of one 
subunit runs antiparallel (not parallel as does the G1 beta-strand of PapD) to strand F of the 
preceding subunit. A parallel beta-strand interaction with strand F of the preceding subunit 
would produce a rod with a star-shaped cross-section (Figs. 7A and 7B), inconsistent with the 
electron microscopy data. Thus, while donor strand complementation with the chaperone 
results in an atypical immunoglobulin fold, donor strand exchange between subunits produces 
a canonical variable-region immunoglobulin fold in the mature pilus. 
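
The application of identical translational and rotational operations to successive subunits can be sketched numerically. The example below is not the modeling procedure used by applicants. It only derives the rise and twist per subunit from a pitch and a number of subunits per turn, and places one reference point per subunit on a helix. The function names are hypothetical, and placing the points at half the rod diameter is a simplifying assumption.

```python
import math


def rise_and_twist(pitch, subunits_per_turn):
    """Axial rise (same unit as pitch) and rotation (degrees) per subunit."""
    return pitch / subunits_per_turn, 360.0 / subunits_per_turn


def helix_positions(n_subunits, radius, pitch, subunits_per_turn):
    """Apply the same rotation and translation repeatedly to a reference point."""
    rise, twist_deg = rise_and_twist(pitch, subunits_per_turn)
    twist = math.radians(twist_deg)
    return [
        (radius * math.cos(i * twist), radius * math.sin(i * twist), i * rise)
        for i in range(n_subunits)
    ]


# Rod parameters quoted in the text: 68 angstrom diameter, 24.9 angstrom pitch,
# 3.28 subunits per turn. Radius = diameter / 2 is an assumption for illustration.
rise, twist = rise_and_twist(24.9, 3.28)
print(f"rise per subunit = {rise:.2f} angstroms, twist = {twist:.1f} degrees")
for point in helix_positions(4, 34.0, 24.9, 3.28):
    print(tuple(round(v, 2) for v in point))
```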

FimC-FimH Chaperone-Adhesin Co-Complex 

Further evidence illustrating donor strand complementation is provided by the 
resolution of the co-crystal structure of the FimC-FimH chaperone-adhesin co-complex from 
uropathogenic E. coli. See Choudhury, et al., Science 285: 1061 (1999). The FimC-FimH 
chaperone-adhesin co-complex structure also reveals a donor strand complementation 
mechanism that explains the basis of both chaperone function and pilus biogenesis. 

The FimH adhesin subunit is folded into two domains of the all-beta class, an NH2- 
terminal mannose-binding domain and a COOH-terminal pilin domain. A short extended 
linker (residues 157H - 159H) connects the two domains. The NH2-terminal mannose- 
binding domain comprises residues 1H - 156H, and the COOH-terminal pilin domain which 
is used to anchor the adhesin to the pilus comprises residues 160H - 279H (Figure 8A). The 
pilin domain of FimH binds in the cleft of the chaperone (Figure 9B) with limited contact 
between FimH and the COOH-terminal domain of FimC. 
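
The residue ranges given above can be captured in a small lookup, sketched here for illustration. The names FIMH_REGIONS and fimh_region are hypothetical; the boundaries are those stated in the text.

```python
# (first residue, last residue, region) for FimH, as stated in the text.
FIMH_REGIONS = [
    (1, 156, "mannose-binding domain"),
    (157, 159, "linker"),
    (160, 279, "pilin domain"),
]


def fimh_region(residue_number):
    """Return the FimH region containing the given residue number."""
    for first, last, name in FIMH_REGIONS:
        if first <= residue_number <= last:
            return name
    raise ValueError(f"residue {residue_number} is outside 1-279")


print(fimh_region(158))
print(fimh_region(181))
```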

The lectin domain of FimH is an eleven-stranded elongated beta-barrel with a jelly 
roll-like topology (Figure 8B). The fold starts with a short beta hairpin that is not part of the 
jelly roll. The final (eleventh) strand of the domain is inserted between the third and tenth 
strands and thus breaks the jelly-roll topology. A pocket capable of accommodating a mono- 
mannose unit is located at the tip of the domain, distal from the connection to the pilin 
domain (Figure 9B). The bottom of the pocket is lined with asparagine, glutamine and 
aspartic acid residues in three loop regions which are typical carbohydrate binding side chains 
(Figure 10A). These residues form hydrogen bonds with C-HEGA as described in Example 3 
herein. 

The pilin domain of FimH has the same immunoglobulin-like topology as the amino- 
terminal domain of FimC, except that the seventh strand of the fold is missing (Figure 8B). 
Two anti-parallel beta-sheets (strands A'BED' and D"CF) pack against each other to form a 
beta-barrel that is similar to, but distinct from, immunoglobulin barrels. As in the 
chaperones, strand switching occurs at the edges of the sheets. In the chaperones, the A1 
strand of the amino-terminal domain switches between the two sheets of the barrel. The first 
strand of the pilin domain exhibits a similar switch, but due to the lack of a seventh strand, 
the second half of the A strand is not involved in main chain hydrogen bonding within the 
domain. The D strand of the chaperones as well as of the FimH pilin domain also switches, 
but in the pilin domain the switch is an eight-residue loop instead of the cis-proline bulge 
found in the chaperones. The C-D loop and the D'-D" connection pack against each other 
and close the top of the barrel. The other side of the barrel, defined by the A and F edge 
strands, is open. Due to the absence of a seventh strand a deep scar is created on the surface 
of the domain. Residues that would be part of the hydrophobic core of an intact, seven 
stranded PapD-like domain instead line a deep hydrophobic crevice on the surface of the pilin 
domain (Figure 10B). 

As mentioned herein, the donor strand complementation mechanism refers to the 
chaperone donating its G1 beta-strand to complete the fold of the pilin domain. The G1 
beta-strand of periplasmic chaperones such as FimC and PapD contains a conserved motif of 
solvent-exposed hydrophobic residues at positions 103, 105 and 107. In the chaperone-subunit 
co-complex, the G1 beta-strand containing these alternating hydrophobic residues is 
used to complete the unfinished hydrophobic core of pilus subunits such as FimH and PapK. 
Thus, in the FimC-FimH co-complex, these hydrophobic residues are used to complete the 
unfinished hydrophobic core of FimH which results from the missing seventh strand. 
Specifically, the seventh (G1) strand from the NH2-terminal domain of the FimC chaperone 
complements the FimH pilin domain by being inserted between the second half of the A 
strand and the F strand of the domain (Figure 10C). Leu 103C and Leu 105C are deeply buried in 
the crevice in the FimH pilin domain. Leu 103C of FimC contacts residues Ile 181H, Val 223H, 
Leu 225H and Ile 272H of FimH. Leu 105C of FimC is in contact with Ile 181H, Leu 252H, Ile 272H 
and Val 274H of FimH. Ile 107C is closer to the FimH pilin domain surface but makes van der Waals 
contacts with residues Val 163H and Phe 276H. The final strand (F) of FimH forms a parallel 
beta-strand interaction with the G1 beta-strand of FimC and has its COOH-terminal carboxylate 
group anchored in the crevice of the chaperone cleft through hydrogen bonding with the 
conserved residues Arg 8C and Lys 112C in FimC (Figure 9A). This interaction is critical for 
chaperone function. 

Furthermore, the two conserved motifs of FimH (the COOH-terminal F strand and an 
amino-terminal motif) participate in subunit-subunit interactions necessary for pilus 
assembly. See G.E. Soto et al., EMBO J., 17: 6155 (1998). An alignment of the pilin 
sequences demonstrates that the amino-terminal motif of FimC was part of a 10-20 residue 
NH2-terminal extension that was missing in the FimH pilin domain (Figure 8A) and 
disordered in the PapD-PapK co-complex as discussed above. This region contains a highly 
conserved pattern of alternating hydrophobic residues (highlighted in Figure 8A) similar to 
the donor G1 beta-strand of the chaperone. Applicants believe that the amino-terminal 
extension of the FimH subunit is structurally analogous to the donor G1 beta-strand motif of 
the chaperone and thus, would fit into the pilin groove occupied by the donor G1 beta-strand 
of the chaperone. 

However, the type 1 pilus is a right-handed helix with about 3 subunits per turn, a 
diameter of approximately 70 Å, a central pore of about 20-25 Å, and a rise per subunit of 
about 8 Å. Thus, in order to obtain this structure, the insertion of the NH2-terminal extension 
must be antiparallel to strand F in contrast to the parallel insertion observed for the G1 
beta-strand of the chaperone. Insertion in a parallel orientation would lead to rosette-like 
structures. One edge of the pilin groove is lined by the COOH-terminal F strand and forms a 
critical part of the subunit tail. Thus, without being bound to any theory, Applicants believe 
that the amino-terminal extension represents the head of a subunit and during pilus 
biogenesis, the amino-terminal extension would displace the donor G1 beta-strand of the 
chaperone to fit into the tail groove of a neighboring subunit to complete the pilin fold of its 
neighbor in a donor strand complementation mechanism. 

Applicants constructed a model for the type 1 pilus using the FimH pilin domain as a 
model for FimA (Figure 11). Each subunit was aligned to have its cleft facing towards the 
center of the pilus so that the height from the top to the bottom of the domain along the helix 
axis was approximately 25 Å. Applying a rotation of 115 degrees and a rise per subunit of 
8 Å, a hollow helical cylinder is created. The outer diameter of this cylinder as measured across 
Cα atoms is 70 Å, and the inner diameter is 25 Å. FimA subunits from different strains of 
E. coli exhibit considerable allelic variation. The vast majority of the variable positions are on 
the outside surface of the pilus model described above (Figure 11) which would account for 
the antigenic variability of type 1 pili. 
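
The helical parameters quoted above can be checked with simple arithmetic. The following is a minimal sketch in Python, not the modeling procedure Applicants used: it takes the stated rotation (115 degrees), rise (8 Å) and outer radius (35 Å, half the stated 70 Å diameter) and derives the number of subunits per turn and the helical pitch. The function names are hypothetical, and each subunit is reduced to a single reference point at the outer radius, which is an assumption made only for illustration.

```python
import math

ROTATION_DEG = 115.0    # rotation per subunit, as stated in the text
RISE_A = 8.0            # rise per subunit in angstroms, as stated in the text
OUTER_RADIUS_A = 35.0   # half of the stated 70 angstrom outer diameter


def helix_parameters(rotation_deg: float, rise: float) -> tuple[float, float]:
    """Return (subunits per turn, pitch) for a helix with the given step."""
    subunits_per_turn = 360.0 / rotation_deg
    pitch = subunits_per_turn * rise
    return subunits_per_turn, pitch


def subunit_positions(n: int, rotation_deg: float, rise: float, radius: float):
    """Place n reference points on a helix around the z axis.

    A positive rotation combined with a positive rise along z gives a
    right-handed helix in a right-handed coordinate system.
    """
    points = []
    for i in range(n):
        angle = math.radians(i * rotation_deg)
        points.append((radius * math.cos(angle), radius * math.sin(angle), i * rise))
    return points


subunits_per_turn, pitch = helix_parameters(ROTATION_DEG, RISE_A)
positions = subunit_positions(4, ROTATION_DEG, RISE_A, OUTER_RADIUS_A)

print(f"subunits per turn: {subunits_per_turn:.2f}")
print(f"pitch: {pitch:.1f} angstroms")
for x, y, z in positions:
    print(f"{x:8.2f} {y:8.2f} {z:6.1f}")
```

Under these inputs the sketch gives about 3.13 subunits per turn and a pitch of about 25 Å, which agrees with the "about 3 subunits per turn" and the approximately 25 Å subunit height given in the text.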

The head-to-tail interaction between subunits in a pilus is reminiscent of 
oligomerization through three dimensional domain swapping in the sense that a part of the 
molecule is used to complement another. However, in this case, complementation occurs not 
only between identical protein chains (FimA in the pilus rod) but also between homologous 
but distinct chains, e.g., FimG, FimF and FimH in the pilus tip. Furthermore, because 
individual pilin protomers do not exist as stable monomers, there is no exchange of 
structural units between a monomeric and an oligomeric state. Instead, a different protein, the 
periplasmic chaperone, is needed to keep the monomeric subunits in solution by donating a 
unique part of its structure (the G1 beta-strand) to the different subunit grooves. 

Based on the structure of the FimC-FimH co-complex and without being limited to 
any theory, it is believed that pilins are missing necessary steric information needed to fold 
into a native three dimensional structure. The information that is missing consists of the 
seventh edge strand of an immunoglobulin fold. This strand, which is necessary for folding, 
is donated to the hydrophobic core of the pilin by the periplasmic chaperone in a donor strand 
complementation mechanism. 

Applicants further utilized the co-crystal structure of the FimC-FimH 
chaperone-adhesin co-complex to identify the amino-terminal mannose-binding domain of FimH, an 
essential component required for pilus adhesion to host tissues. As discussed above, the 
bottom of this mannose-binding domain is lined with asparagine, glutamine and aspartic acid 
residues and those skilled in the art would be able to use molecular modeling techniques and 
other existing protocols to design and synthesize antibacterial compounds. Such compounds 
would compete with mannose for binding to the FimH adhesin thereby preventing or 
inhibiting pilus adhesion to host epithelium. 

Thus, applicants utilized the discovery of this molecular mechanism of protein 
binding to identify an essential part of a defined binding site responsible for pilus assembly 
and adhesion. Further, applicants have utilized this structure to design and fabricate methods 
and compounds to compete with the chaperone for binding to the exposed binding site of the 
pilus subunit thereby inhibiting pilus assembly and reducing the pathogenicity of piliated 
Gram-negative bacteria. Such a compound is useful in treating bacterial diseases or in 
preventing costly biofilm formation in medical, industrial and various other settings. 

Peptide compounds 

Thus, the present invention is directed to compounds which mimic the capability of a 
periplasmic chaperone or of a pilus subunit to bind to the groove of a pilus subunit, thereby 
preventing or inhibiting pilus biogenesis by interfering with the normal function of these 
biological components. Specifically, applicants have shown that prevention or inhibition of 
the binding between pilus subunits and between pilus subunits and periplasmic chaperones 
can be accomplished in a number of ways. 

In a preferred embodiment of the invention, the compounds are peptides or peptide 
analogs that are capable of disrupting the assembly of pilus subunits and/or binding the cleft 
of a pilus subunit that is bound by the G1 beta-strand of another pilus subunit in an assembled 
pilus structure and comprise a core sequence of residues preferably derived from a conserved 
N-terminal region of a pilus subunit. As will be apparent from alignments of the conserved 
N-terminal regions of the various pilus subunits, such peptides and peptide analogs will 
typically comprise at least two alternating hydrophobic amino acids. The core sequence of 
such peptides and peptide analogs may be derived from the amino-terminal sequence of any 
of a number of pilus subunits, including but not limited to, PapA, PrsA, FimA, AfaA, FocA, 
HifA, HafA, Fim2, Fim3, MrpA, PmfA, LpfA, PefA, AtfA, PapK, PrsK, PapH, PrsH, PapE, 
PrsE, MrpB, SfaG, SfaS, FocG, FocF, PapF, PrsF, MrpF, MrpE, F17A, FanC, FaeA, MrkA 
and RalC. Typically, the core sequence is composed of about 3 to about 12 residues, 
preferably 5 to 9, most preferably 7 residues. The core sequence may correspond identically 
to the sequence of a pilus subunit, or it may include one or more substitutions, preferably 
conservative substitutions, and/or insertions and/or deletions. 

Moreover, the core sequence may be flanked at either or both of its N- and C-termini 
by residues of random sequence (i.e., sequences that do not necessarily correspond to 
the pilus subunit from which the core sequence is derived). When included, such flanking 
residues should not significantly alter the ability of the core sequence to disrupt subunit 
assembly. Thus, typically the compounds of the invention will include fewer than 5 flanking 
residues at each terminus, preferably fewer than 3 flanking residues, and most preferably no 
flanking residues. 

Further, the peptides and/or peptide analogs may comprise hybrid sequences. For 
example, the peptide or peptide analog may include a core sequence derived from PapA 
flanked at one or both termini with sequences derived from FimA. Alternatively, the peptide 
or peptide analog may include a core sequence of, for example, 10 residues, some of which 
are, for example, derived from PapA and the rest of which are, for example, derived from 
FimA. 

In one illustrative embodiment, the compounds are 10 to 20 residue peptides and/or 
peptide analogs comprising formula (I): 

(I) X1-X2-X3-X4-X5-X6-X7-X8-X9-X10 

or a pharmaceutically-acceptable salt thereof, wherein: 

X1 is any amino acid residue, preferably other than a basic residue; 
X2 is any amino acid residue, preferably other than an aliphatic residue; 
X3 is a hydrophobic residue, preferably an aliphatic residue or a 
    hydroxyl-substituted aliphatic residue; 
X4 is any amino acid residue, preferably other than an acidic residue; 
X5 is a hydrophobic residue or Gly; 
X6 is a hydrophobic or a hydrophilic residue; 
X7 is a hydrophobic residue, preferably Gly, an amide-substituted polar residue 
    or an aliphatic residue, and most preferably Gly; 
X8 is any amino acid residue, preferably other than an aliphatic residue; 
X9 is an aliphatic residue; and 
X10 is any amino acid residue, preferably a hydrophobic residue, more 
    preferably an aliphatic residue or a polar residue. 

In the compounds comprising formula (I), the symbol "-" between residues Xn 
generally designates a backbone constitutive linking function. Thus, when the compounds 
are peptides, the symbol "-" represents a peptide or amide linkage (-C(O)NH-). It is to be 
understood, however, that formula (I) includes peptide analogs in which one or more amide 
linkages is optionally replaced with a linkage other than amide linkage, preferably a 
substituted amide or an isostere of amide linkage. Thus, while the various Xn residues within 
formula (I) may conveniently be described in terms of "amino acids" or "residue," those 
having skill in the art will recognize that in embodiments having non-amide linkages, the 
term "amino acid" or "residue" as used herein refers to other bifunctional moieties bearing 
side-chain groups similar in structure to the side chains of the amino acids. 

Substituted amide linkages generally include, but are not limited to, groups of the 
formula -C(O)N(R)-, where R is (C1-C6) alkyl, substituted (C1-C6) alkyl, (C2-C6) alkenyl, 
substituted (C2-C6) alkenyl, (C2-C6) alkynyl, substituted (C2-C6) alkynyl, (C5-C20) aryl, 
substituted (C5-C20) aryl, (C6-C26) arylalkyl, substituted (C6-C26) arylalkyl, 5-20 membered 
heteroaryl, substituted 5-20 membered heteroaryl, 6-26 membered heteroarylalkyl and 
substituted 6-26 membered heteroarylalkyl. 

Isosteres of amide linkages generally include, but are not limited to, -CH2NH-, 
-CH2S-, -CH2CH2-, -CH=CH- (cis and trans), -C(O)CH2-, -CH(OH)CH2- and -CH2SO-. 
Compounds having such non-amide linkages and methods for preparing such compounds are 
well-known in the art (see, e.g., Spatola, March 1983, Vega Data Vol. 1, Issue 3; Spatola, 
1983, "Peptide Backbone Modifications" In: Chemistry and Biochemistry of Amino Acids 
Peptides and Proteins, Weinstein, ed., Marcel Dekker, New York, p. 267 (general review); 
Morley, 1980, Trends Pharm. Sci. 1:463-468; Hudson et al., 1979, Int. J. Prot. Res. 14:177-185 
(-CH2NH-, -CH2CH2-); Spatola et al., 1986, Life Sci. 38:1243-1249 (-CH2-S-); Hann, 
1982, J. Chem. Soc. Perkin Trans. I. 1:307-314 (-CH=CH-, cis and trans); Almquist et al., 
1980, J. Med. Chem. 23:1392-1398 (-COCH2-); Jennings-White et al., Tetrahedron Lett. 
23:2533 (-COCH2-); European Patent Application EP 45665 (1982) CA 97:39405 
(-CH(OH)CH2-); Holladay et al., 1983, Tetrahedron Lett. 24:4401-4404 (-C(OH)CH2-); and 
Hruby, 1982, Life Sci. 31:189-199 (-CH2-S-)). 

Additionally, one or more amide linkages can be replaced with peptidomimetic or 
amide mimetic moieties which do not significantly interfere with the structure or activity of 
the peptides. Suitable amide mimetic moieties are described, for example, in Olson et al., 
1993, J. Med. Chem. 36:3039-3049. 

Compounds comprising formula (I) that are peptide analogs may provide significant 
therapeutic advantages, as their non-peptide interlinkages may confer on the compounds 
enhanced stability towards proteases and/or peptidases, thereby conferring on the compounds 
increased in vivo stability compared to a corresponding peptide. 

The various residues X1 through X10 may be selected from amongst the genetically 
encoded amino acids, as well as from genetically non-encoded amino acids. Moreover, the 
residues may be in either the D- or L-configuration, as long as the compound retains activity. 
Compounds including D-amino acids may have enhanced in vivo stability. Preferably, all of 
residues X1 through X10 are in the L-configuration. 

The peptides and peptide analogs of formula (I) may optionally include, in addition to 
the sequence defined by residues X1 through X10, a 1 to 5 residue peptide or peptide analog at 
either or both termini. Peptide analogs typically contain at least one modified interlinkage, 
such as a substituted amide or an isostere of an amide, as described above. Such additional 
peptides or peptide analogs may have an amino acid sequence derived from a pilus subunit or, 
alternatively, their sequences may be completely random. Compounds including such 
random sequences may be tested for biological activity in the various assays and methods 
described in a later section. 

The residues which comprise such additional peptides or peptide analogs may be 
genetically encoded or non-encoded, and may be in either the D- or L-configuration. In one 
embodiment, when the sequence defined by formula (I) is a peptide, one or both termini are 
"capped" with 1 to 5 residue peptides composed wholly of D-amino acids that serve to protect 
the core sequence from degradation in vivo by proteases and/or peptidases. 

Also included within the scope of the present invention are "blocked" forms of the 
peptides and peptide analogs including formula (I), i.e., 10 to 20 residue peptides and/or peptide 
analogs in which the N- and/or C-terminus is blocked with a moiety capable of reacting with 
the N-terminal -NH2 or C-terminal -C(O)OH. Such blocked compounds are typically 
N-terminal acylated and/or C-terminal amidated or esterified. Typical N-terminal blocking 
groups include R1C(O)-, where R1 is hydrogen, (C1-C6) alkyl, (C2-C6) alkenyl, (C2-C6) 
alkynyl, (C5-C20) aryl, (C6-C26) arylalkyl, 5-20 membered heteroaryl or 6-26 membered 
heteroarylalkyl. Preferred N-terminal blocking groups include acetyl, formyl and dansyl. 
Typical C-terminal blocking groups include -C(O)NR1R1 and -C(O)OR1, where each R1 is 
independently as defined above. Preferred C-terminal blocking groups include those in 
which each R1 is independently (C1-C6) alkyl, preferably methyl, ethyl, propyl or isopropyl. 

Preferred amongst the 10 to 20 residue peptides and/or peptide analogs comprising 
formula (I) are those compounds having one or more of the following characteristics: 
X3 is an aliphatic residue or T; 
X5 is an aliphatic residue, F or G; and/or 
X7 is G, H or A. 
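
The preferred characteristics listed above can be applied to a candidate 10-residue core sequence mechanically. The following Python sketch is offered only as an illustration under stated assumptions: "aliphatic" is taken to mean A, V, L and I (this passage does not enumerate the class), sequences use one-letter codes, and the function name is hypothetical.

```python
# Assumption: "aliphatic" is taken here as A, V, L and I.
ALIPHATIC = set("AVLI")

# Allowed one-letter codes at the three positions (1-based) named in the text.
PREFERRED_FORMULA_I = {
    3: ALIPHATIC | {"T"},        # X3 is an aliphatic residue or T
    5: ALIPHATIC | {"F", "G"},   # X5 is an aliphatic residue, F or G
    7: {"G", "H", "A"},          # X7 is G, H or A
}


def preferred_positions_met(core: str) -> dict[int, bool]:
    """Report, per position, whether a 10-residue core meets each preference."""
    if len(core) != 10:
        raise ValueError("formula (I) core sequences have exactly 10 residues")
    return {
        position: core[position - 1] in allowed
        for position, allowed in PREFERRED_FORMULA_I.items()
    }


papa_result = preferred_positions_met("GKVTFNGTVV")  # PapA/PrsA motif, SEQ ID NO: 2
focf_result = preferred_positions_met("CMLAGSNFVT")  # FocF motif, SEQ ID NO: 21
print(papa_result)
print(focf_result)
```

Because the text requires only "one or more" of the characteristics, the sketch reports each position separately rather than returning a single pass/fail value.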

Particularly preferred are the 10-residue peptides described in Table 2, below. 

Table 2: SUBUNIT N-TERMINAL-MOTIF-DERIVED PEPTIDES 

AMINO ACID SEQUENCE              PILUS SUBUNIT 
GKVTFNGTVV (SEQ ID NO: 2)        PapA, PrsA 
GTVHFKGEVV (SEQ ID NO: 3)        FimA, SfaA, FocA 
GKVTFFGKVV (SEQ ID NO: 4)        HifA, HafA 
GTIVITGTIT (SEQ ID NO: 5)        Fim2 
GTIVITGSIS (SEQ ID NO: 6)        Fim3 
GTVKFVGSII (SEQ ID NO: 7)        MrpA 
GEIQLKGEIV (SEQ ID NO: 8)        PmfA 
GTIKFTGEIV (SEQ ID NO: 9)        LpfA 
NEVTFLGSVS (SEQ ID NO: 10)       PefA 
GTINFEGSVV (SEQ ID NO: 11)       AtfA 
SDVAFRGNLL (SEQ ID NO: 12)       PapK, PrsK 
GRAAFHGEVV (SEQ ID NO: 13)       PapH 
GRATFHGEVV (SEQ ID NO: 14)       PrsH 
DNLTFRGKLI (SEQ ID NO: 15)       PapE 
DNLTFKGKLI (SEQ ID NO: 16)       PrsE 
GWLNLQGTIL (SEQ ID NO: 17)       MrpB 
SVVNITGNVQ (SEQ ID NO: 18)       SfaG 
TTITVTGNVL (SEQ ID NO: 19)       SfaS 
TTITVTGRVL (SEQ ID NO: 20)       FocG 
CMLAGSNFVT (SEQ ID NO: 21)       FocF 
VQINIRGNVY (SEQ ID NO: 22)       PapF, PrsF 
PNLKLFGTLL (SEQ ID NO: 23)       MrpF 
VYINITGNVI (SEQ ID NO: 24)       MrpE 
GKITFNGKVV (SEQ ID NO: 25)       F17A 
GTINFNGKIT (SEQ ID NO: 26)       FanC 
QKTIFSADVV (SEQ ID NO: 27)       FaeA 
GQVNFFGKVT (SEQ ID NO: 28)       MrkA 
QRTIITADVV (SEQ ID NO: 29)       RalC 

In a preferred embodiment of the invention, the compounds are peptides or peptide 
analogs that mimic the binding activity of the G1 beta-strand of a chaperone and that exhibit 
antibacterial activity against a Gram-negative bacterium. The core sequence of such peptides 
and peptide analogs may be derived from the G1 beta-strand of any of a number of 
chaperones, including but not limited to, PapD, MrpD, FanE, SfaE, FaeE, MrkB, HifB, F17D, 
FimC, FimB, PefD, EcpD, ClpE, YehC, PmfF, FocC, LpfB, SefB, Caf1M, CS3-1, CsaB, 
MyfB, AggD, CssC, NfaA and AfaB. Typically, the core sequence is composed of about 3 to 
about 12 residues, preferably from 4 to 9 residues and most preferably 7 residues. The core 
sequence may correspond identically to the G1 beta-strand sequence of a chaperone, or it may 
include one or more substitutions, preferably conservative substitutions, and/or insertions 
and/or deletions. 

Moreover, the core sequence may be flanked at either or both of its N- and C-termini 
by residues of random sequence (i.e., sequences that do not necessarily correspond to 
the G1 beta-strand from which the core sequence is derived). When included, such flanking 
residues should not significantly alter the ability of the core sequence to mimic the binding 
activity of the G1 beta-strand of a chaperone. Thus, typically the compounds of the invention 
will include fewer than 5 flanking residues at each terminus, preferably fewer than 3 flanking 
residues and most preferably no flanking residues. 

Further, the peptides and/or peptide analogs may comprise hybrid sequences. For 
example, the peptide or peptide analog may include a core sequence derived from the G1 
beta-strand of a PapD chaperone flanked at one or both termini with sequences derived from an 
MrpD chaperone. Alternatively, the peptide or peptide analog may include a core sequence 
of, for example, 7 residues, some of which are, for example, derived from a PapD chaperone 
and the rest of which are derived from, for example, a FanE chaperone. 

In one illustrative embodiment, the compounds are 7 to 17 residue peptides and/or 
peptide analogs comprising formula (II): 

(II) X11-X12-X13-X14-X15-X16-X17 

or a pharmaceutically-acceptable salt thereof, wherein: 

X11 is any amino acid residue, preferably other than a basic residue; 
X12 is any amino acid residue; 
X13 is a hydrophobic residue, preferably an aliphatic residue or an apolar 
    residue, wherein the apolar residue is preferably M; 
X14 is any amino acid residue, preferably other than an aromatic residue; 
X15 is a hydrophobic residue, preferably an aliphatic residue; 
X16 is any amino acid residue, preferably an aliphatic residue or a 
    hydroxyl-substituted aliphatic residue; and 
X17 is a hydrophobic residue or a hydroxyl-substituted aliphatic residue, 
    preferably an aliphatic residue, F, M or a hydroxyl-substituted aliphatic residue. 

In the compounds comprising formula (II), the symbol "-" between residues Xn is as 
previously defined for formula (I). 

The various residues X11 through X17 may be selected from amongst the genetically 
encoded amino acids, as well as from genetically non-encoded amino acids. Moreover, the 
residues may be in either the D- or L-configuration, as long as the compound retains activity. 
Compounds including D-amino acids may have enhanced in vivo stability. Preferably, all of 
residues X11 through X17 are in the L-configuration. 

The peptides and peptide analogs of formula (II) may optionally include, in addition 
to the sequence defined by residues X11 through X17, a 1 to 5 residue peptide or peptide analog 
at either or both termini. Peptide analogs typically contain at least one modified interlinkage, 
such as a substituted amide or an isostere of an amide, as described above. Such additional 
peptides or peptide analogs may have an amino acid sequence derived from the G1 beta-strand 
of a chaperone or, alternatively, their sequences may be completely random. Compounds 
including such random sequences may be tested for biological activity in the various assays 
and methods described in a later section. 

The residues which comprise such additional peptides or peptide analogs may be 
genetically encoded or non-encoded, and may be in either the D- or L-configuration. In one 
convenient embodiment, when the sequence defined by formula (II) is a peptide, one or both 
termini are "capped" with 1 to 5 residue peptides composed wholly of D-amino acids that 
serve to protect the core sequence from degradation in vivo by proteases and/or peptidases. 

Also included within the scope of the present invention are "blocked" forms of the 
peptides and peptide analogs including formula (II), as previously described in connection 
with compounds comprising formula (I). 

Preferred amongst the 7 to 17 residue peptides and/or peptide analogs comprising 
formula (II) are those compounds having one or more of the following characteristics: 
X13 is an aliphatic residue or M; 
X15 is an aliphatic residue, F or M; and/or 
X17 is an aliphatic residue, F, M or T. 
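
For a 7-residue core, X13, X15 and X17 fall at positions 3, 5 and 7, so the three preferences can be written as a single pattern. The following Python sketch illustrates one way to screen for cores that satisfy all three at once. It assumes that "aliphatic" means A, V, L and I and that one-letter codes are used; the pattern and function name are hypothetical and are not part of the disclosure.

```python
import re

# Assumption: "aliphatic" is taken here as A, V, L and I.
# Position 3 (X13): aliphatic or M; position 5 (X15): aliphatic, F or M;
# position 7 (X17): aliphatic, F, M or T. Other positions are unconstrained.
ALL_PREFERENCES_FORMULA_II = re.compile(r"..[AVLIM].[AVLIFM].[AVLIFMT]")


def meets_all_preferences(core: str) -> bool:
    """Return True if a 7-residue core satisfies all three listed preferences."""
    return ALL_PREFERENCES_FORMULA_II.fullmatch(core) is not None


candidates = ["NVLQIAL", "LNVNVVT", "NVGQIAL"]  # the last is a made-up variant
results = {core: meets_all_preferences(core) for core in candidates}
print(results)
```

The first two candidates are G1 beta-strand sequences of the kind recited for PapD and SefB; the third is a hypothetical variant with Gly at position 3, included only to show a sequence that does not satisfy the X13 preference.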

Particularly preferred are the 7-residue peptides described in Table 3, below. 

Table 3: CHAPERONE G1 BETA-STRAND-DERIVED PEPTIDES 

AMINO ACID SEQUENCE           CHAPERONE 
NVLQIAL (SEQ ID NO: 1)        PapD, MrpD 
GSLSLAI (SEQ ID NO: 30)       FanE 
NYLQFAI (SEQ ID NO: 31)       SfaE 
SGIAVAL (SEQ ID NO: 32)       FaeE 
NILQLAI (SEQ ID NO: 33)       MrkB 
SFMQIAI (SEQ ID NO: 34)       HifB 
NYLQFAV (SEQ ID NO: 35)       F17D 
NTLQLAI (SEQ ID NO: 36)       FimC 
GVLQLTI (SEQ ID NO: 37)       FimB 
NVLAVAV (SEQ ID NO: 38)       PefD 
SLLQLAF (SEQ ID NO: 39)       EcpD 
SGIAVAV (SEQ ID NO: 40)       ClpE 
NALKFAM (SEQ ID NO: 41)       YehC 
NVLQMAM (SEQ ID NO: 42)       PmfD 
NYLQFAI (SEQ ID NO: 43)       FocC 
NVLQIAV (SEQ ID NO: 44)       LpfB 
LNVNVVT (SEQ ID NO: 45)       SefB 
VFVQFAI (SEQ ID NO: 46)       Caf1M 
MKLNVSI (SEQ ID NO: 47)       CS3-1 
MDIQMSI (SEQ ID NO: 48)       PsaB 
LNILLSV (SEQ ID NO: 49)       MyfB 
MNIQVSV (SEQ ID NO: 50)       AggD 
DSINISI (SEQ ID NO: 51)       CssC 
LNVQLSV (SEQ ID NO: 52)       NfaA, AfaB 

Deletions of residues from either terminus of the peptides and peptide analogs of 
formula (I) or (II) are also contemplated to be within the scope of the invention. Such 
deletions consist of the removal of one or more amino acids of the peptide sequence, with the 
lower limit length of the resulting peptide sequence being 3 to 7 amino acids, preferably 3 to 
5 amino acids. Such deletions may involve a single contiguous portion or more than one 
discrete portion of the peptide sequence. One or more such deletions may be introduced into the 
sequence, as long as such deletions result in peptides which may still bind in whole, or in 
part, to a pilus subunit and consequently prevent or inhibit pilus biogenesis. 
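
The set of terminal deletion variants of a given core sequence can be enumerated directly. The Python sketch below lists every fragment obtained by removing residues from the N-terminus, the C-terminus, or both, down to a chosen minimum length. It covers terminal deletions only, not internal ones, and the function name and the default minimum of 3 residues are choices made for this illustration.

```python
def terminal_truncations(peptide: str, min_length: int = 3) -> list[str]:
    """List fragments left after deleting residues from either or both termini."""
    full_length = len(peptide)
    fragments = []
    for start in range(full_length):
        for end in range(start + min_length, full_length + 1):
            if end - start < full_length:  # skip the undeleted parent peptide
                fragments.append(peptide[start:end])
    return fragments


fragments = terminal_truncations("NVLQIAL")  # PapD/MrpD core, SEQ ID NO: 1
for fragment in fragments:
    print(fragment)
```

Whether any given fragment retains binding activity is not something such an enumeration can indicate; each candidate would still need to be tested in the assays referred to in this specification.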

It will be appreciated that by virtue of the present invention, the above-described 
polypeptides can be synthesized using conventional synthesis procedures commonly used by 
one skilled in the art. For example, the polypeptides can be chemically synthesized using an 
automated peptide synthesizer (such as one manufactured by Pharmacia LKB Biotechnology 
Co., LKB Biolynk 4170 or Milligen, Model 9050 (Milligen, Milford, MA)) following the 
method of Sheppard, et al., Journal of Chemical Society Perkin I, p. 538 (1981). In this 
procedure, N,N-dicyclohexylcarbodiimide is added to amino acids whose amine functional 
groups are protected by 9-fluorenylmethoxycarbonyl (Fmoc) groups and anhydrides of the 
desired amino acids are produced. These Fmoc-amino acid anhydrides can then be used for 
peptide synthesis. An Fmoc-amino acid anhydride corresponding to the C-terminal amino acid 
residue is fixed to Ultrosyn A resin through the carboxyl group using dimethylaminopyridine 
as a catalyst. Next, the resin is washed with dimethylformamide containing piperidine, and 
the protecting group of the amino functional group of the C-terminal amino acid is removed. The 
next amino acid corresponding to the desired peptide is coupled to the C-terminal amino acid. 
The deprotecting process is then repeated. Successive desired amino acids are fixed in the 
same manner until the peptide chain of the desired sequence is formed. The protective groups 
other than the acetamidomethyl are then removed and the peptide is released with solvent. 

Alternatively, the polypeptides can be synthesized by using nucleic acid molecules 
which encode the peptides of this invention in an appropriate expression vector which includes 
the encoding nucleotide sequences. Such DNA molecules may be readily prepared using an 
automated DNA synthesizer and the well-known codon-amino acid relationship of the genetic 
code. Such a DNA molecule also may be obtained as genomic DNA or as cDNA using 
oligonucleotide probes and conventional hybridization methodologies. Such DNA molecules 
may be incorporated into expression vectors, including plasmids, which are adapted for the 
expression of the DNA and production of the polypeptide in a suitable host such as a 
bacterium, e.g., Escherichia coli, yeast cell or mammalian cell. 
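
The codon-amino acid relationship referred to above can be illustrated by writing out one possible coding sequence for a peptide. The Python sketch below uses a single codon per amino acid from the standard genetic code. The particular choice among synonymous codons is an assumption made for illustration, and the function names are hypothetical. Start and stop codons and the regulatory elements an expression vector would need are not shown.

```python
# One valid codon per amino acid from the standard genetic code.
# The choice among synonymous codons is illustrative only.
CODON_FOR = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
AMINO_ACID_FOR = {codon: residue for residue, codon in CODON_FOR.items()}


def reverse_translate(peptide: str) -> str:
    """Write one DNA coding sequence for a peptide given in one-letter codes."""
    return "".join(CODON_FOR[residue] for residue in peptide)


def translate(dna: str) -> str:
    """Translate a sequence built from the codons above back into a peptide."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "".join(AMINO_ACID_FOR[codon] for codon in codons)


coding_sequence = reverse_translate("NVLQIAL")  # PapD/MrpD core, SEQ ID NO: 1
round_trip = translate(coding_sequence)
print(coding_sequence)
print(round_trip)
```

Translating the generated sequence back to protein, as in the last step, is a simple consistency check on the codon table.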

It is known that certain modifications can be made without completely abolishing the 
polypeptide's antibacterial activity. Modifications include the removal and addition of amino 
acids. Polypeptides containing other modifications can be synthesized by one skilled in the 
art and compounds comprising such polypeptides may be tested for biological activity in the 
various assays and methods described in a later section. Thus, the effectiveness of the 
polypeptides can be modulated through various changes in the amino acid sequence or 
structure. 

Further, it should be understood that the mimic may be modified using methods 
known in the art to improve binding, specificity, solubility, safety, or efficacy. A necessary 
characteristic of these preferred compounds is the capability to interact with at least one pilus 
subunit during transport of these pilus subunits through periplasmic space and/or during the 
process of assembly of the intact pilus, in such a manner that pilus biogenesis is prevented or 
inhibited. The compound can be any compound, preferably a peptide, which has one of the 
above effects on pilus subunits and thereby on the assembly of an intact pilus. 

Moreover, the present invention is directed to a compound which will mimic the 
capability of mannose to bind to the mannose binding site at the tip of the FimH adhesin, 
thereby preventing or inhibiting the ability of the pilus to adhere and infect host tissues. As 
discussed above, the bottom of this mannose-binding domain of FimH is lined with 
asparagine, glutamine and aspartic acid residues and those skilled in the art would be able to 
use molecular modeling techniques and other existing protocols to design and synthesize 
antibacterial compounds. Such compounds would compete with mannose for binding to the 
FimH adhesin thereby preventing or inhibiting pilus adhesion to host epithelium. As such, 
these compounds may be used in methods of preventing or inhibiting pili adhesion to a host 
tissue. 

The present invention also provides a method for inhibiting bacterial colonization by a 
Gram-negative organism. This method involves administration of a compound which will 
interfere with the binding of a chaperone to a pilus subunit, thereby preventing the assembly 
of an intact pilus structure. In a preferred embodiment of the invention, a method of 
preventing or inhibiting the assembly of pilus subunits is provided by interfering with, in the 
PapK pilus subunit, a binding site which is normally involved in the binding to pilus subunits 
during transport of these pilus subunits through the periplasmic space and/or during the 
process of pilus assembly. In another embodiment of the invention, a method of preventing 
or inhibiting the assembly of pilus subunits is provided by interfering with, in the FimC pilus 
subunit, a binding site which is normally involved in the binding to pilus subunits during 
transport of these pilus subunits through the periplasmic space and/or during the process of 
pilus assembly. 

Antibacterial compounds and pharmaceutical compositions 

In another preferred embodiment of the invention, a method of preventing or 
inhibiting the assembly of pilus subunits is provided by administering an antibacterial 
compound which will mimic the capability of a periplasmic chaperone or a pilus subunit to 
bind to a pilus subunit. Also provided is a method of preventing or inhibiting the adhesion of 
a pilus to a host tissue by administering an antibacterial compound which will bind to a pilus 
mannose-binding domain. 

The antibacterial compositions of the present invention may be utilized to inhibit pilus 
assembly and/or pilus adhesion by providing an effective amount of such compositions to a 
patient. 

For use as antimicrobials for treatment of animal subjects, the compounds of the 
invention can be formulated as pharmaceutical or veterinary compositions. Depending on the 
subject to be treated, the mode of administration, and the type of treatment desired, e.g., 
prevention, prophylaxis, therapy, the compounds are formulated in ways consonant with 
these parameters. A summary of such techniques is found in Remington's Pharmaceutical 
Sciences, latest edition, Mack Publishing Co., Easton, PA. 

For administration to animal or human subjects, the dosage of the compounds of the 
invention is typically 0.1-100 mg/kg. However, dosage levels are highly dependent on the 
nature of the infection, the condition of the patient, the judgment of the practitioner, and the 
frequency and mode of administration. The dosage of such a substance is expected to be the 
dosage which is normally employed when administering antibacterial drugs to patients or 
animals, i.e., 1 µg - 1000 µg per kilogram of body weight per day. The dosage will depend 
partly on the route of administration of the substance. If the oral route is employed, the 
absorption of the substance will be an important factor. A low absorption will have the effect 
that in the gastro-intestinal tract higher concentrations, and thus higher dosages, will be 
necessary. Also, the dosage of such a substance when treating infections of the central 
nervous system (CNS) will be dependent on the permeability of the blood-brain barrier for 
the substance. As is well-known in the treatment of bacterial meningitis with penicillin, very 
high dosages are necessary in order to obtain effective concentrations in the CNS. 

It will be understood that the appropriate dosage of the substance should suitably be 
assessed by performing animal model tests, wherein the effective dose level (e.g., ED50) and 
the toxic dose level (e.g., TD50) as well as the lethal dose level (e.g., LD50 or LD10) are 
established in suitable and acceptable animal models. Further, if a substance has proven 
effective in such animal tests, controlled clinical trials should be performed. Needless to state, 
such clinical trials should be performed according to the standards of Good Clinical Practice. 

In general, for use in treatment, the compounds of the invention may be used alone or 
in combination with other antibiotics such as erythromycin, tetracycline, macrolides (for 
example, azithromycin) and the cephalosporins. Depending on the mode of administration, the 
compounds will be formulated into suitable compositions to permit facile delivery to the 
affected areas. 

Formulations may be prepared in a manner suitable for systemic administration or 
topical or local administration. Systemic formulations include those designed for injection 
(e.g., intramuscular, intravenous or subcutaneous injection) or may be prepared for 
transdermal, transmucosal, or oral administration. The formulation will generally include a 
diluent as well as, in some cases, adjuvants, buffers, preservatives and the like. 

For oral administration, the compounds can be administered also in liposomal 
compositions or as microemulsions. Suitable forms include syrups, capsules and tablets, as is 
understood in the art. For injection, formulations can be prepared in conventional forms as 
liquid solutions or suspensions or as solid forms suitable for solution or suspension in liquid 
prior to injection or as emulsions. Suitable excipients include, for example, water, saline, 
dextrose, glycerol and the like. Such compositions may also contain amounts of nontoxic 
auxiliary substances such as wetting or emulsifying agents, pH buffering agents and the like, 
such as, for example, sodium acetate, sorbitan monolaurate, and so forth. 

It will be understood that the above-described methods comprising administration of 
substances in treating and/or preventing diseases are dependent on the identification or de 
novo design of substances which are capable of exerting effects which will lead to prevention 
or inhibition of the interaction between pilus subunits and periplasmic molecular chaperones. 
It is further important that these substances will have a high chance of being therapeutically 
active. 

Thus clinical experimental trials and animal studies can be undertaken to demonstrate 
the therapeutic efficacy of peptide mimics and analogues for preventing or inhibiting pilus 
assembly. The efficacy of such compounds can be shown using methods known in the art, 
including pilus inhibition and binding assays, specifically ELISA or hemagglutination. 

The antibacterial compositions of the present invention also have a variety of 
industrial uses, well known to those skilled in such arts, relating to their antibacterial 
properties. In general, these uses are carried out by bringing a biocidal or bacterial inhibitory 
amount of the antibacterial compositions of the present invention into contact with a surface, 
environment or biozone containing Gram-negative bacteria so that the composition is able to 
interact with and thereby interfere with the biological function of such bacteria. For example, 
such antibacterial compositions can be used to prevent or inhibit biofilm formation caused by 
Gram-negative bacteria and to inhibit bacterial colonization by a Gram-negative organism. 
Compositions may be formulated as sprays, solutions, pellets, powders and in other forms of 
administration well known to those skilled in such arts. 

Crystalline PapD-PapK Chaperone-Subunit Co-Complex and 
FimH-FimC Chaperone-Adhesin Co-Complex 

The present invention provides, for the first time, the high-resolution three- 
dimensional structure and atomic structure coordinates of the crystalline co-complexes of the 
PapD-PapK chaperone-subunit as determined by X-ray crystallography. Also provided for 
use in the methods of the present invention are the high-resolution three-dimensional 
structure and atomic structure coordinates for the crystalline co-complexes of the FimC- 
FimH chaperone-adhesin as determined by X-ray crystallography. The specific methods used 
to obtain the structure coordinates are provided in the examples, infra. The atomic structure 
coordinates of crystalline PapD-PapK co-complex, obtained from the co-crystal to 2.4 Å 
resolution, are listed in Table 4. The atomic structure coordinates of crystalline FimC-FimH 
co-complex, obtained from the co-crystal to 2.5 Å resolution, are listed in Table 5. 

Additional antibacterial compounds can be modeled and synthesized utilizing the 
atomic coordinates obtained from the resolution of the co-crystal structure of the PapD-PapK 
chaperone-subunit co-complex and the FimC-FimH chaperone-adhesin co-complex. For 
example, as discussed herein, applicants utilized the co-crystal structure of the FimC-FimH 
chaperone-adhesin co-complex to identify the NH2-terminal mannose-binding domain of 
FimH, an essential component required for pilus adhesion to host tissues. As the COOH- 
terminus of pilus subunits in many tissue-adhering bacteria has been found to be highly 
conserved, it is believed that the antibacterial compounds of the present invention are capable 
of interacting with the majority of pilus subunits and thus are useful in the treatment of 
various diseases caused by piliated bacteria. 

Thus, the invention encompasses a co-crystal of a pilus chaperone-subunit co- 
complex comprising an amino acid sequence of a G1 beta-strand of a periplasmic chaperone 
and an amino acid sequence from the amino-terminal sequence of a pilus subunit. Preferably, 
the amino acid sequence of a G1 beta-strand would be the N101 to L107 amino acid region of 
a G1 beta-strand of a pilus chaperone, and even more preferably, the amino acid sequence of a 
G1 beta-strand would be the N101 to L107 amino acid region of a G1 beta-strand of a PapD 
chaperone and most preferably, the amino acid sequence of the G1 beta-strand would be SEQ 
ID NO: 1. Preferably, the amino acid sequence of the amino-terminal sequence would be 
from the N-terminal sequence of a PapK subunit, and more preferably, the amino acid 
sequence of the amino-terminal sequence would be the amino acid sequence of SEQ ID NO: 
12. In a preferred embodiment, the co-crystal is a crystalline form of the polypeptides 
corresponding to the PapD-PapK chaperone-subunit co-complex. In a preferred embodiment 
of the invention, the co-crystal effectively diffracts X-rays for the determination of the atomic 
coordinates of the pilus chaperone-subunit co-complex to a resolution of from about 3 
angstroms to about 2.4 angstroms or greater. 

Preferably, co-crystals of the invention comprise crystallized polypeptides 
corresponding to the wild-type PapD-PapK chaperone-subunit co-complex. The co-crystals 
of the invention include native co-crystals in which the crystallized PapD-PapK chaperone- 
subunit co-complex is substantially pure and heavy-atom derivative co-crystals in which 
the crystallized PapD-PapK chaperone-subunit co-complex is in association with one or more 
heavy-metal atoms. The co-crystals from which the atomic structure coordinates of the 
crystalline co-complexes of the present invention may be obtained include native co-crystals 
and heavy-atom derivative co-crystals. Native co-crystals generally comprise substantially 
pure polypeptides corresponding to the PapD-PapK co-complex in crystalline form. 

It is to be understood that the crystalline PapD-PapK co-complex from which the 
atomic structure coordinates of the invention can be obtained is not limited to the wild-type 
PapD-PapK co-complex. Indeed, the co-crystals may comprise mutants of the wild-type co- 
complex. Mutants of wild-type co-complexes are obtained by replacing at least one amino 
acid residue in the sequences of one or both of the polypeptides comprising the wild-type co- 
complex with a different amino acid residue, or by adding or deleting one or more amino acid 
residues within the wild-type sequences and/or at the N- and/or C-terminus of one or both of 
the polypeptides comprising the wild-type co-complex. Preferably, such mutants will 
crystallize under crystallization conditions that are substantially similar to those used to 
crystallize the wild-type co-complex. 

The types of mutants contemplated by this invention include conservative mutants, 
non-conservative mutants, deletion mutants, truncated mutants, extended mutants, methionine 
mutants, selenomethionine mutants, cysteine mutants and selenocysteine mutants. A mutant 
may have, but need not have, pilus subunit binding activity. Preferably, a mutant displays 
biological activity that is substantially similar to that of the wild-type polypeptide. 
Methionine, selenomethionine, cysteine, and selenocysteine mutants are particularly useful for 
producing heavy-atom derivative co-crystals, as described in detail below. 

It will be recognized by one of skill in the art that the types of mutants contemplated 
herein are not mutually exclusive; that is, for example, a polypeptide having a conservative 
mutation in one amino acid may in addition have a truncation of residues at the N-terminus, 
and several Leu or Ile -> Met mutations. 

Sequence alignments of polypeptides in a protein family or of homologous 
polypeptide domains can be used to identify potential amino acid residues in the polypeptide 
sequence that are candidates for mutation. Identifying mutations that do not significantly 
interfere with the three-dimensional structure of the PapD-PapK co-complex and the FimC- 
FimH co-complex and/or that do not deleteriously affect, and that may even enhance, the 
activity of the co-complex will depend, in part, on the region where the mutation occurs. 

Conservative amino acid substitutions are well-known in the art, and include 
substitutions made on the basis of a similarity in polarity, charge, solubility, hydrophobicity 
and/or the hydrophilicity of the amino acid residues involved. Typical conservative 
substitutions are those in which the amino acid is substituted with a different amino acid that 
is a member of the same class or category, as those classes are defined herein. Thus, typical 
conservative substitutions include aromatic to aromatic, apolar to apolar, aliphatic to 
aliphatic, acidic to acidic, basic to basic, polar to polar, etc. Other conservative amino acid 
substitutions are well known in the art. It will be recognized by those of skill in the art that 
generally, a total of about 20% or fewer, typically about 10% or fewer, most usually about 
5% or fewer, of the amino acids in the wild-type polypeptide sequence can be conservatively 
substituted with other amino acids without deleteriously affecting the biological activity 
and/or three-dimensional structure of the molecule, provided that such substitutions do not 
involve residues that are critical for activity, as discussed above. 
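
The classification described above can be sketched in a few lines of Python. This is only an illustration: the residue classes below are a commonly used grouping chosen for the example and are an assumption, not the class definitions of this specification, and the function names and the two short sequences are hypothetical.

```python
# Illustrative residue classes (an assumption for this sketch; the
# specification defines its own classes elsewhere).
RESIDUE_CLASSES = {
    "aromatic": set("FWY"),
    "aliphatic": set("AVLIM"),
    "acidic": set("DE"),
    "basic": set("KRH"),
    "polar": set("STNQC"),
    "special": set("GP"),
}


def residue_class(residue):
    """Return the name of the class that contains a one-letter residue code."""
    for name, members in RESIDUE_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue code: {residue!r}")


def substitution_summary(wild_type, mutant):
    """Count conservative and non-conservative substitutions.

    Both sequences must have the same length, so insertions, deletions and
    truncations are outside the scope of this sketch.
    """
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have the same length")
    conservative = 0
    non_conservative = 0
    for wt_res, mut_res in zip(wild_type, mutant):
        if wt_res == mut_res:
            continue
        if residue_class(wt_res) == residue_class(mut_res):
            conservative += 1
        else:
            non_conservative += 1
    return {
        "conservative": conservative,
        "non_conservative": non_conservative,
        "fraction_conservative": conservative / len(wild_type),
    }


# Hypothetical 8-residue sequences, used only to exercise the function.
summary = substitution_summary("ALKDEFST", "VLKEEFSA")
```

The returned fraction can be compared against the approximate 20%, 10% or 5% thresholds mentioned above; whether a given substitution touches a residue critical for activity has to be judged separately.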

The heavy-atom derivative co-crystals from which the atomic structure coordinates of 
the invention are obtained generally comprise a crystalline co-complex in association with 
one or more heavy metal atoms. The polypeptides may correspond to a wild-type or a mutant 
PapD-PapK co-complex or FimC-FimH co-complex, which may optionally be further 
associated with one or more molecules. There are two types of heavy-atom derivatives of 
polypeptides: heavy-atom derivatives resulting from exposure of the proteins to a heavy metal 
in solution, wherein co-crystals are grown in medium comprising the heavy metal, or in 
crystalline form, wherein the heavy metal diffuses into the co-crystal, and heavy-atom 
derivatives wherein at least one of the polypeptides in the co-complex comprises heavy-atom 
containing amino acids, e.g., selenomethionine and/or selenocysteine mutants. 



In practice, heavy-atom derivatives of the first type can be formed by soaking a native 
co-crystal in a solution comprising heavy metal atom salts, or organometallic compounds, 
e.g., lead chloride, gold thiomalate, thimerosal, uranyl acetate, platinum tetrachloride, 
osmium tetroxide, zinc sulfate, and cobalt hexammine, which can diffuse through the co- 
crystal and bind to the crystalline polypeptides. 

Heavy-atom derivatives of this type can also be formed by adding to a crystallization 
solution comprising the polypeptides to be co-crystallized an amount of a heavy metal atom 
salt, which may associate with at least one of the proteins and be incorporated into the co- 
crystal. The location(s) of the bound heavy metal atom(s) can be determined by X-ray 
diffraction analysis of the co-crystal. This information, in turn, is used to generate the phase 
information needed to construct the three-dimensional structure of the proteins in the co- 
complex. 

The native and/or heavy-atom derivative co-crystals from which the atomic structure 
coordinates of the invention are obtained can be obtained by conventional means as are well- 
known in the art of protein crystallography, including batch, liquid bridge, dialysis, and vapor 
diffusion methods (see, e.g., McPherson, 1982, Preparation and Analysis of Protein Crystals, 
John Wiley, New York; McPherson, 1990, Eur. J. Biochem. 189:1-23; Weber, 1991, Adv. 
Protein Chem. 41:1-36). Generally, native co-crystals are grown by dissolving substantially 
pure polypeptides corresponding to the PapD-PapK co-complex or the FimH-FimC co-complex in 
an aqueous buffer containing a precipitant at a concentration just below that necessary to 
precipitate the protein. Examples of precipitants include, but are not limited to, polyethylene 
glycol, ammonium sulfate, 2-methyl-2,4-pentanediol, sodium citrate, sodium chloride, 
glycerol, isopropanol, lithium sulfate, sodium acetate, sodium formate, potassium sodium 
tartrate, ethanol, hexanediol, ethylene glycol, dioxane, t-butanol and combinations thereof. 
Water is removed by controlled evaporation to produce precipitating conditions, which are 
maintained until co-crystal growth ceases. 

In a preferred embodiment, native co-crystals are grown by vapor diffusion in hanging 
drops (McPherson, 1982, Preparation and Analysis of Protein Crystals, John Wiley, New 
York; McPherson, 1990, Eur. J. Biochem. 189:1-23). In this method, the 
polypeptide/precipitant solution is allowed to equilibrate in a closed container with a larger 
aqueous reservoir having a precipitant concentration optimal for producing crystals. 
Generally, less than about 25 µL of substantially pure polypeptide solution is mixed with an 
equal volume of reservoir solution, giving a precipitant concentration about half that required 
for crystallization. This solution is suspended as a droplet underneath a coverslip, which is 
sealed onto the top of the reservoir. The sealed container is allowed to stand, usually for 
about 2-6 weeks, until co-crystals grow. 

Heavy-atom derivative co-crystals can be obtained by soaking native co-crystals in 
mother liquor containing salts of heavy metal atoms. Further, heavy-atom derivative co- 
crystals can also be obtained from SeMet and/or SeCys mutants, as described above for 
native co-crystals. 

Mutant proteins may crystallize under slightly different crystallization conditions than 
wild-type protein, or under very different crystallization conditions, depending on the nature 
of the mutation, and its location in the protein. For example, a non-conservative mutation 
may result in alteration of the hydrophilicity of the mutant, which may in turn make the 
mutant protein either more soluble or less soluble than the wild-type protein. Typically, if a 
protein becomes more hydrophilic as a result of a mutation, it will be more soluble than the 
wild-type protein in an aqueous solution and a higher precipitant concentration will be needed 
to cause it to crystallize. Conversely, if a protein becomes less hydrophilic as a result of a 
mutation, it will be less soluble in an aqueous solution and a lower precipitant concentration 
will be needed to cause it to crystallize. If the mutation happens to be in a region of the 
protein involved in crystal lattice contacts, crystallization conditions may be affected in more 
unpredictable ways. 

The dimensions of a unit cell of a crystal are defined by six numbers, the lengths of 
three unique edges, a, b, and c, and three unique angles, α, β and γ. The type of unit cell that 
comprises a crystal is dependent on the values of these variables. In one embodiment, the co- 
crystal of the PapD-PapK pilus chaperone-subunit co-complex has the space group P2₁2₁2₁ 
with unit cell dimensions of a = 62.1 ± 0.2 angstroms, b = 63.6 ± 0.2 angstroms and c = 92.7 
± 0.2 angstroms such that the three-dimensional structure of the crystallized co-complex can 
be determined to a resolution of from about 3 angstroms to about 2.4 angstroms or greater. In 
another embodiment, the co-crystal of the FimC-FimH chaperone-adhesin co-complex has 
the space group P4₁2₁2 or P4₃2₁2 with unit cell dimensions of a = b = 97.7 ± 0.2 angstroms and c = 
215.9 ± 0.2 angstroms such that the three-dimensional structure of the co-complex can be 
determined to a resolution of from about 3 angstroms to about 2.5 angstroms or greater. 

When a crystal is placed in an X-ray beam, the incident X-rays interact with the 
electron cloud of the molecules that make up the crystal, resulting in X-ray scatter. The 
combination of X-ray scatter with the lattice of the crystal gives rise to nonuniformity of the 
scatter; areas of high intensity are called diffracted X-rays. The angle at which diffracted 
beams emerge from the crystal can be computed by treating diffraction as if it were reflection 
from sets of equivalent, parallel planes of atoms in a crystal (Bragg's Law). The most 
obvious sets of planes in a crystal lattice are those that are parallel to the faces of the unit cell. 
These and other sets of planes can be drawn through the lattice points. Each set of planes is 
identified by three indices, hkl. The h index gives the number of parts into which the a edge 
of the unit cell is cut, the k index gives the number of parts into which the b edge of the unit 
cell is cut, and the l index gives the number of parts into which the c edge of the unit cell is 
cut by the set of hkl planes. Thus, for example, the 235 planes cut the a edge of each unit cell 
into halves, the b edge of each unit cell into thirds, and the c edge of each unit cell into fifths. 
Planes that are parallel to the bc face of the unit cell are the 100 planes; planes that are 
parallel to the ac face of the unit cell are the 010 planes; and planes that are parallel to the ab 
face of the unit cell are the 001 planes. 
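
As a hedged numerical illustration of Bragg's Law and the hkl indices, the sketch below computes the spacing between hkl planes and the corresponding Bragg angle. It assumes a unit cell whose three angles are all 90 degrees, uses the PapD-PapK cell edges quoted in this specification, and assumes a copper K-alpha wavelength of 1.5418 angstroms, which is a value chosen for the example and not taken from this document. The function names are hypothetical.

```python
import math


def d_spacing_orthogonal(h, k, l, a, b, c):
    """Interplanar spacing d (angstroms) for planes hkl in a cell with 90-degree angles.

    Uses 1/d^2 = (h/a)^2 + (k/b)^2 + (l/c)^2, which holds only when all
    three cell angles are 90 degrees.
    """
    if (h, k, l) == (0, 0, 0):
        raise ValueError("hkl = 000 does not define a set of planes")
    inverse_d_squared = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    return 1.0 / math.sqrt(inverse_d_squared)


def bragg_angle_degrees(d, wavelength, order=1):
    """Bragg angle theta from n * wavelength = 2 * d * sin(theta)."""
    sine = order * wavelength / (2.0 * d)
    if sine > 1.0:
        raise ValueError("reflection cannot be observed at this wavelength")
    return math.degrees(math.asin(sine))


# Cell edges from the text; the wavelength is an assumption for the example.
A, B, C = 62.1, 63.6, 92.7
ASSUMED_WAVELENGTH = 1.5418

d_100 = d_spacing_orthogonal(1, 0, 0, A, B, C)   # planes parallel to the bc face
d_235 = d_spacing_orthogonal(2, 3, 5, A, B, C)   # the 235 planes of the example
theta_100 = bragg_angle_degrees(d_100, ASSUMED_WAVELENGTH)
```

For the 100 planes the spacing equals the a edge itself, which is why widely spaced planes diffract at very small angles while finely spaced planes such as 235 diffract at larger ones.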

When a detector is placed in the path of the diffracted X-rays, in effect cutting into the 
sphere of diffraction, a series of spots, or reflections, are recorded to produce a "still" 
diffraction pattern. Each reflection is the result of X-rays reflecting off one set of parallel 
planes, and is characterized by an intensity, which is related to the distribution of molecules 
in the unit cell, and hkl indices, which correspond to the parallel planes from which the beam 
producing that spot was reflected. If the crystal is rotated about an axis perpendicular to the 
X-ray beam, a large number of reflections is recorded on the detector, resulting in a 
diffraction pattern. 

The unit cell dimensions and space group of a crystal can be determined from its 
diffraction pattern. First, the spacing of reflections is inversely proportional to the lengths of 
the edges of the unit cell. Therefore, if a diffraction pattern is recorded when the X-ray beam 
is perpendicular to a face of the unit cell, two of the unit cell dimensions may be deduced 
from the spacing of the reflections in the x and y directions of the detector, the crystal-to- 
detector distance, and the wavelength of the X-rays. Those of skill in the art will appreciate 
that, in order to obtain all three unit cell dimensions, the crystal must be rotated such that the 
X-ray beam is perpendicular to another face of the unit cell. Second, the angles of a unit cell 
can be determined by the angles between lines of spots on the diffraction pattern. Third, the 
absence of certain reflections and the repetitive nature of the diffraction pattern, which may 
be evident by visual inspection, indicate the internal symmetry, or space group, of the crystal. 
Therefore, a crystal may be characterized by its unit cell and space group, as well as by its 
diffraction pattern. 
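
The inverse relationship between spot spacing and cell edge can be sketched as follows. The example assumes a flat detector perpendicular to the X-ray beam and a reflection of known order lying along one axis of the pattern; the detector distance and wavelength are illustrative values, not values from this document, and the function names are hypothetical.

```python
import math


def cell_edge_from_spot(spot_offset_mm, detector_distance_mm, wavelength, order=1):
    """Estimate a cell edge (angstroms) from the position of one reflection.

    spot_offset_mm is the distance of the spot from the direct-beam position
    on a flat detector. The scattering angle 2*theta follows from simple
    trigonometry, and Bragg's Law then gives edge = n * wavelength / (2 sin theta).
    """
    two_theta = math.atan2(spot_offset_mm, detector_distance_mm)
    return order * wavelength / (2.0 * math.sin(two_theta / 2.0))


def spot_offset_mm(edge, detector_distance_mm, wavelength, order=1):
    """Inverse calculation: where the order-n reflection lands on the detector."""
    theta = math.asin(order * wavelength / (2.0 * edge))
    return detector_distance_mm * math.tan(2.0 * theta)


# Assumed experimental geometry for the example.
ASSUMED_DISTANCE_MM = 100.0
ASSUMED_WAVELENGTH = 1.5418

offset = spot_offset_mm(62.1, ASSUMED_DISTANCE_MM, ASSUMED_WAVELENGTH)
recovered_edge = cell_edge_from_spot(offset, ASSUMED_DISTANCE_MM, ASSUMED_WAVELENGTH)
```

Doubling the cell edge roughly halves the spot offset, which is the inverse proportionality stated above.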

Once the dimensions of the unit cell are determined, the likely number of polypeptides 
in the asymmetric unit can be deduced from the size of the polypeptide, the density of the 
average protein, and the typical solvent content of a protein crystal, which is usually in the 
range of 30-70% of the unit cell volume. 
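
A minimal sketch of this estimate follows, using the widely used Matthews coefficient (cell volume per dalton of protein) and the approximation that the solvent fraction is 1 - 1.23/V_M. The cell edges are those quoted in this specification for the PapD-PapK co-crystal; the 90-degree cell angles, the four asymmetric units per cell and the 42,000 dalton mass of the complex are assumptions made for the example, and the function names are hypothetical.

```python
def orthogonal_cell_volume(a, b, c):
    """Volume (cubic angstroms) of a unit cell whose angles are all 90 degrees."""
    return a * b * c


def matthews_coefficient(cell_volume, asym_units_per_cell, copies, mol_weight_da):
    """V_M in cubic angstroms per dalton."""
    return cell_volume / (asym_units_per_cell * copies * mol_weight_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction for a typical protein density."""
    return 1.0 - 1.23 / v_m


def plausible_copy_numbers(cell_volume, asym_units_per_cell, mol_weight_da,
                           low=0.30, high=0.70, max_copies=8):
    """Copies per asymmetric unit whose solvent content falls in the usual range."""
    plausible = []
    for copies in range(1, max_copies + 1):
        v_m = matthews_coefficient(cell_volume, asym_units_per_cell,
                                   copies, mol_weight_da)
        if low <= solvent_fraction(v_m) <= high:
            plausible.append(copies)
    return plausible


volume = orthogonal_cell_volume(62.1, 63.6, 92.7)  # cell edges from the text
ASSUMED_ASYM_UNITS = 4            # assumption: four asymmetric units per cell
ASSUMED_COMPLEX_MASS_DA = 42000.0  # assumption: placeholder mass of one co-complex
copies = plausible_copy_numbers(volume, ASSUMED_ASYM_UNITS, ASSUMED_COMPLEX_MASS_DA)
```

Under these assumptions only one co-complex per asymmetric unit gives a solvent content inside the 30-70% range; a different mass or symmetry would change the answer.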

The diffraction pattern is related to the three-dimensional shape of the molecule by a 
Fourier transform. The process of determining the solution is in essence a re-focusing of the 
diffracted X-rays to produce a three-dimensional image of the molecule in the crystal. Since 
re-focusing of X-rays cannot be done with a lens at this time, it is done via mathematical 
operations. 

The sphere of diffraction has symmetry that depends on the internal symmetry of the 
crystal, which means that certain orientations of the crystal will produce the same set of 
reflections. Thus, a crystal with high symmetry has a more repetitive diffraction pattern, and 
there are fewer unique reflections that need to be recorded in order to have a complete 
representation of the diffraction. The goal of data collection, a dataset, is a set of consistently 
measured, indexed intensities for as many reflections as possible. A complete dataset is 
collected if at least 80%, preferably at least 90%, most preferably at least 95% of unique 
reflections are recorded. In one embodiment, a complete dataset is collected using one 
crystal. In another embodiment, a complete dataset is collected using more than one crystal 
of the same type. 

Sources of X-rays include, but are not limited to, a rotating anode X-ray generator 
such as a Rigaku RU-200 or a beamline at a synchrotron light source, such as the Advanced 
Photon Source at Argonne National Laboratory. Suitable detectors for recording diffraction 
patterns include, but are not limited to, X-ray sensitive film, multiwire area detectors, image 
plates coated with phosphor, and CCD cameras. Typically, the detector and the X-ray 
beam remain stationary, so that, in order to record diffraction from different parts of the 
crystal's sphere of diffraction, the crystal itself is moved via an automated system of 
moveable circles called a goniostat. 

One of the biggest problems in data collection, particularly from macromolecular 
crystals having a high solvent content, is the rapid degradation of the crystal in the X-ray 
beam. In order to slow the degradation, data is often collected from a crystal at liquid 
nitrogen temperatures. In order for a crystal to survive the initial exposure to liquid nitrogen, 
the formation of ice within the crystal must be prevented by the use of a cryoprotectant. 
Suitable cryoprotectants include, but are not limited to, low molecular weight polyethylene 
glycols, ethylene glycol, sucrose, glycerol, xylitol, and combinations thereof. Crystals may 
be soaked in a solution comprising the one or more cryoprotectants prior to exposure to liquid 
nitrogen, or the one or more cryoprotectants may be added to the crystallization solution. 
Data collection at liquid nitrogen temperatures may allow the collection of an entire dataset 
from one crystal. 

Once a dataset is collected, the information is used to determine the three-dimensional 
structure of the molecule in the crystal. However, this cannot be done from a single 
measurement of reflection intensities because certain information, known as phase 
information, is lost between the three-dimensional shape of the molecule and its Fourier 
transform, the diffraction pattern. This phase information must be acquired by methods 
described below in order to perform a Fourier transform on the diffraction pattern to obtain 
the three-dimensional structure of the molecule in the crystal. It is the determination of phase 
information that in effect refocuses X-rays to produce the image of the molecule. 
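
The loss of phase information can be illustrated with a one-dimensional toy model. The sketch below is not a crystallographic program: it takes a made-up density on eight grid points, computes its discrete Fourier coefficients, and then rebuilds the density once with the true phases and once with the amplitudes alone (all phases set to zero). The density values and function names are hypothetical.

```python
import cmath


def structure_factors(density):
    """Discrete Fourier coefficients F(h) of a one-dimensional density."""
    n = len(density)
    return [
        sum(rho * cmath.exp(2j * cmath.pi * h * x / n)
            for x, rho in enumerate(density))
        for h in range(n)
    ]


def fourier_synthesis(factors):
    """Inverse transform: rebuild the density from complex coefficients."""
    n = len(factors)
    return [
        (sum(f * cmath.exp(-2j * cmath.pi * h * x / n)
             for h, f in enumerate(factors)) / n).real
        for x in range(n)
    ]


# Made-up one-dimensional "electron density" with three peaks.
density = [0.0, 0.0, 5.0, 1.0, 0.0, 0.0, 2.0, 0.0]

factors = structure_factors(density)
amplitudes = [abs(f) for f in factors]   # what intensities can provide

with_phases = fourier_synthesis(factors)
without_phases = fourier_synthesis([complex(a, 0.0) for a in amplitudes])
```

With the true phases the original density is recovered to rounding error; with amplitudes only, the synthesis piles density at the origin and the peaks are no longer where they belong.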

One method of obtaining phase information is by isomorphous replacement, in which 
heavy-atom derivative crystals are used. In this method, the positions of heavy atoms bound 
to the molecules in the heavy-atom derivative crystal are determined, and this information is 
then used to obtain the phase information necessary to elucidate the three-dimensional 
structure of a native crystal (Blundell et al., 1976, Protein Crystallography, Academic Press). 

Another method of obtaining phase information is by molecular replacement, which is 
a method of calculating initial phases for a new crystal of a polypeptide or polypeptide co- 
complex whose structure coordinates are unknown by orienting and positioning a polypeptide 
whose structure coordinates are known within the unit cell of the new crystal so as to best 
account for the observed diffraction pattern of the new crystal. Phases are then calculated 
from the oriented and positioned polypeptide and combined with observed amplitudes to 
provide an approximate Fourier synthesis of the structure of the molecules comprising the 
new crystal. (Lattman, 1985, Methods in Enzymology 115:55-77; Rossmann, 1972, "The 
Molecular Replacement Method," Int. Sci. Rev. Ser. No. 13, Gordon & Breach, New York). 

A third method of phase determination is multi-wavelength anomalous dispersion or 
MAD. In this method, X-ray diffraction data are collected at several different wavelengths 
from a single crystal containing at least one heavy atom with absorption edges near the 
energy of incoming X-ray radiation. The resonance between X-rays and electron orbitals 
leads to differences in X-ray scattering that permit the locations of the heavy atoms to be 
identified, which in turn provides phase information for a crystal of a polypeptide. A detailed 
discussion of MAD analysis can be found in Hendrickson, 1985, Trans. Am. Crystallogr. 
Assoc., 21:11; Hendrickson et al., 1990, EMBO J. 9:1665; and Hendrickson, 1991, Science 
4:91. 

A fourth method of determining phase information is single wavelength anomalous 
dispersion or SAD. In this technique, X-ray diffraction data are collected at a single 
wavelength from a single native or heavy-atom derivative crystal, and phase information is 
extracted using anomalous scattering information from atoms such as sulfur or chlorine in the 
native crystal or from the heavy atoms in the heavy-atom derivative crystal. The wavelength 
of X-rays used to collect data for this phasing technique need not be close to the absorption 
edge of the anomalous scatterer. A detailed discussion of SAD analysis can be found in 
Brodersen et al., 2000, Acta Cryst. D56:431-441. 

A fifth method of determining phase information is single isomorphous replacement 
with anomalous scattering or SIRAS. This technique combines isomorphous replacement 
and anomalous scattering techniques to provide phase information for a crystal of a 
polypeptide. X-ray diffraction data are collected at a single wavelength, usually from a single 
heavy-atom derivative crystal. Phase information obtained only from the location of the 
heavy atoms in a single heavy-atom derivative crystal leads to an ambiguity in the phase 
angle, which is resolved using anomalous scattering from the heavy atoms. Phase 
information is therefore extracted from both the location of the heavy atoms and from 
anomalous scattering of the heavy atoms. A detailed discussion of SIRAS analysis can be 
found in North, 1965, Acta Cryst. 18:212-216; Matthews, 1966, Acta Cryst. 20:82-86. 
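
The two-fold phase ambiguity of a single isomorphous derivative can be shown with a short calculation. Writing the derivative structure factor as the vector sum of the native and heavy-atom contributions, the law of cosines gives two candidate native phases placed symmetrically about the heavy-atom phase. The sketch below assumes perfect isomorphism and error-free amplitudes; the numerical values and the function name are hypothetical, and the anomalous-scattering step that selects between the candidates is not implemented.

```python
import cmath
import math


def sir_phase_candidates(native_amplitude, derivative_amplitude, heavy_atom_factor):
    """Return the two native phases (radians) allowed by one derivative.

    From |F_PH|^2 = |F_P|^2 + |F_H|^2 + 2 |F_P| |F_H| cos(alpha_P - alpha_H),
    the native phase is alpha_H plus or minus the same angle.
    """
    heavy_amplitude = abs(heavy_atom_factor)
    heavy_phase = cmath.phase(heavy_atom_factor)
    cosine = (
        derivative_amplitude ** 2 - native_amplitude ** 2 - heavy_amplitude ** 2
    ) / (2.0 * native_amplitude * heavy_amplitude)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    delta = math.acos(cosine)
    return heavy_phase + delta, heavy_phase - delta


# Made-up structure factors: the "true" native phase is 40 degrees.
true_native = cmath.rect(10.0, math.radians(40.0))
heavy = cmath.rect(3.0, math.radians(100.0))
derivative = true_native + heavy

candidates = sir_phase_candidates(abs(true_native), abs(derivative), heavy)
candidate_degrees = [math.degrees(c) for c in candidates]
```

One candidate reproduces the true phase and the other is its mirror image about the heavy-atom phase; amplitudes alone cannot tell them apart, which is the ambiguity that anomalous scattering resolves in SIRAS.
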

Once phase information is obtained, it is combined with the diffraction data to 
produce an electron density map, an image of the electron clouds that surround the molecules 
in the unit cell. The higher the resolution of the data, the more distinguishable are the 
features of the electron density map, e.g., amino acid side chains and the positions of 
carbonyl oxygen atoms in the peptide backbones, because atoms that are closer together are 
resolvable. A model of the macromolecule is then built into the electron density map with the 
aid of a computer, using as a guide all available information, such as the polypeptide 
sequence and the established rules of molecular structure and stereochemistry. Interpreting 
the electron density map is a process of finding the chemically realistic conformation that fits 
the map precisely. 

After a model is generated, a structure is refined. Refinement is the process of 
minimizing the function Φ, which is the difference between observed and calculated intensity 
values (measured by an R-factor), and which is a function of the position, temperature factor, 
and occupancy of each non-hydrogen atom in the model. This usually involves alternate 
cycles of real space refinement, i.e., calculation of electron density maps and model building, 
and reciprocal space refinement, i.e., computational attempts to improve the agreement 
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between the original intensity data and intensity data generated from each successive model. 
Refinement ends when the function <J> converges on a minimum wherein the model fits the 
electron density map and is stereochemically and conformationally reasonable. During 
refinement, ordered solvent molecules are added to the structure. 
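
By way of illustration only, the agreement statistic referred to above may be sketched as follows, assuming the conventional crystallographic definition R = Σ| |Fobs| - |Fcalc| | / Σ|Fobs| computed over structure-factor amplitudes. The function name and the sample amplitudes are hypothetical and are not data from the co-crystals described herein.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must have equal length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


# Hypothetical amplitudes, for illustration only.
observed = [100.0, 50.0, 25.0, 25.0]
calculated = [90.0, 55.0, 30.0, 20.0]
print(r_factor(observed, calculated))
```

A lower value indicates closer agreement between the model and the measured data; refinement programs minimize more elaborate target functions that also include stereochemical restraints.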

The atomic structure coordinates and machine readable media of the invention have a 
variety of uses. The present invention encompasses the structure coordinates and other 
information, e.g., amino acid sequence, connectivity tables, vector-based representations, 
temperature factors, etc., used to generate the three-dimensional structures of the polypeptides 
for use in the software programs described below and other software programs. For example, 
the coordinates are useful for solving the three-dimensional X-ray diffraction and/or solution 
structures of other proteins, including mutant PapD-PapK chaperone-subunit or FimC-FimH 
chaperone-adhesin co-complexes, PapD-PapK chaperone-subunit co-complexes or FimC- 
FimH chaperone-adhesin co-complexes that are further associated with other molecules, and 
unrelated proteins, to high resolution. Structural information may also be used in a variety of 
molecular modeling and computer-based screening applications to, for example, intelligently 
design mutants of the crystallized PapD-PapK chaperone-subunit co-complex or the 
crystallized FimC-FimH chaperone-adhesin co-complex that have altered biological activity 
and to computationally design and identify compounds that bind the G1 beta-strand of a 
periplasmic chaperone or the amino-terminal end of a pilus subunit. Such compounds may be 
used as lead compounds in pharmaceutical efforts to identify compounds that inhibit pilus 
biogenesis as a therapeutic approach toward the treatment of several types of disease caused 
by pathogenic Gram-negative bacteria such as Escherichia coli, Haemophilus influenzae, 
Salmonella enteritidis, Salmonella typhimurium, Bordetella pertussis, Yersinia enterocolitica, 
Yersinia pestis, Helicobacter pylori and Klebsiella pneumoniae. 

In a further aspect of the invention, such potential antibacterial compounds are 
evaluated for their capacity to prevent or treat a bacterial infection. These methods comprise 
designing and synthesizing candidate antibacterial compounds using the atomic coordinates 
of the three-dimensional structure of such co-crystals, and screening each compound for its 
ability to bind to pilus subunits, thereby inhibiting or preventing pilus biogenesis. The 
antibacterial activity of the compound is determined by assaying the bacterium for infectivity 
or monitoring the pilus for activity. Compounds that are able to prevent or inhibit pilus 
biogenesis or the ability of the bacterial pilus to infect a host tissue can be used in the 
pharmaceutical compositions of the present invention. 

Additionally, the invention encompasses machine readable media embedded with the 
three-dimensional structures of the models described herein, or with portions thereof. As 
used herein, "machine readable medium" refers to any medium that can be read and accessed 
directly by a computer or scanner. Such media include, but are not limited to: magnetic 
storage media, such as floppy discs, hard disc storage medium and magnetic tape; optical 
storage media such as optical discs or CD-ROM; electrical storage media such as RAM or 
ROM; and hybrids of these categories such as magnetic/optical storage media. Such media 
further include paper on which is recorded a representation of the atomic structure 
coordinates, e.g., Cartesian coordinates, that can be read by a scanning device and converted 
into a three-dimensional structure with Optical Character Recognition (OCR) software. 

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon the atomic structure coordinates of the 
invention or portions thereof and/or X-ray diffraction data. The choice of the data storage 
structure will generally be based on the means chosen to access the stored information. In 
addition, a variety of data processor programs and formats can be used to store the sequence 
and X-ray data information on a computer readable medium. Such formats include, but are 
not limited to, Protein Data Bank ("PDB") format (Research Collaboratory for Structural 
Bioinformatics; http://www.rcsb.org/pdb/docs/format/pdbguide2.2/guide2.2_frame.html); 
Cambridge Crystallographic Data Centre format 
(http://www.ccdc.cam.ac.uk/support/csd_doc/volume3/z323.html); Structure-data ("SD") file 
format (MDL Information Systems, Inc.; Dalby et al., 1992, J. Chem. Inf. Comp. Sci. 32:244-255), 
and line-notation, e.g., as used in SMILES (Weininger, 1988, J. Chem. Inf. Comp. Sci. 
28:31-36). Methods of converting between various formats read by different computer 
software will be readily apparent to those of skill in the art, e.g., BABEL (v. 1.06, Walters & 
Stahl, ©1992, 1993, 1994; http://www.brunel.ac.uk/departments/chem/babel.htm). All 
format representations of the polypeptide coordinates described herein, or portions thereof, 
are contemplated by the present invention. By providing a computer readable medium having 
stored thereon the atomic coordinates of the invention, one of skill in the art can routinely 
access the atomic coordinates of the invention, or portions thereof, and related information for 
use in modeling and design programs, described in detail below. 
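
As a non-limiting sketch of how coordinates stored in PDB format might be read back from such a medium, the following example parses the fixed-column ATOM/HETATM record of that format. The function names and the example atom are hypothetical, and the writer shown uses a simplified left-aligned atom name rather than the full alignment conventions of the format.

```python
def parse_atom_record(line):
    """Parse one fixed-column PDB ATOM/HETATM record into a dict."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        raise ValueError("not an ATOM/HETATM record")
    return {
        "serial": int(line[6:11]),        # columns 7-11
        "name": line[12:16].strip(),      # columns 13-16
        "res_name": line[17:20].strip(),  # columns 18-20
        "chain": line[21],                # column 22
        "res_seq": int(line[22:26]),      # columns 23-26
        "x": float(line[30:38]),          # columns 31-38
        "y": float(line[38:46]),          # columns 39-46
        "z": float(line[46:54]),          # columns 47-54
        "occupancy": float(line[54:60]),  # columns 55-60
        "b_factor": float(line[60:66]),   # columns 61-66
    }


def format_atom_record(atom):
    """Write a dict back out as a simplified ATOM record."""
    return (
        "ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain:1s}{res_seq:4d}    "
        "{x:8.3f}{y:8.3f}{z:8.3f}{occupancy:6.2f}{b_factor:6.2f}"
    ).format(**atom)


# Hypothetical atom, for illustration only (not taken from the deposited coordinates).
example_atom = {
    "serial": 1, "name": "CA", "res_name": "LEU", "chain": "D", "res_seq": 103,
    "x": 12.345, "y": -6.789, "z": 0.5, "occupancy": 1.0, "b_factor": 25.0,
}
print(parse_atom_record(format_atom_record(example_atom)))
```

Because the record is column-based rather than whitespace-delimited, slicing by column position is more robust than splitting on spaces when adjacent fields run together.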

While Cartesian coordinates are important and convenient representations of the 
three-dimensional structure of a polypeptide, those of skill in the art will readily recognize 
that other representations of the structure are also useful. Therefore, the three-dimensional 
structure of a polypeptide, as discussed herein, includes not only the Cartesian coordinate 
representation, but also all alternative representations of the three-dimensional distribution of 
atoms. For example, atomic coordinates may be represented as a Z-matrix, wherein a first 
atom of the protein is chosen, a second atom is placed at a defined distance from the first 
atom, a third atom is placed at a defined distance from the second atom so that it makes a 
defined angle with the first atom. Each subsequent atom is placed at a defined distance from 
a previously placed atom with a specified angle with respect to the third atom, and at a 
specified torsion angle with respect to a fourth atom. Atomic coordinates may also be 
represented as a Patterson function, wherein all interatomic vectors are drawn and are then 
placed with their tails at the origin. This representation is particularly useful for locating 
heavy atoms in a unit cell. In addition, atomic coordinates may be represented as a series of 
vectors having magnitude and direction and drawn from a chosen origin to each atom in the 
polypeptide structure. Furthermore, the positions of atoms in a three-dimensional structure 
may be represented as fractions of the unit cell (fractional coordinates), or in spherical polar 
coordinates. 
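
For illustration, two of the alternative representations mentioned above may be sketched as follows. The sketch assumes an orthorhombic unit cell (all cell angles 90°) whose axes coincide with the Cartesian axes, and uses the cell edges reported for the PapD-PapK co-crystals in Example 1; the function names are hypothetical.

```python
import math

# Cell edges in angstroms (a, b, c); orthorhombic cell assumed, so each
# fractional coordinate is simply the Cartesian coordinate divided by the edge.
CELL = (62.12, 63.69, 92.72)


def cartesian_to_fractional(xyz, cell=CELL):
    """Express a Cartesian position as fractions of the unit cell edges."""
    return tuple(coord / edge for coord, edge in zip(xyz, cell))


def fractional_to_cartesian(frac, cell=CELL):
    """Inverse of cartesian_to_fractional for an orthorhombic cell."""
    return tuple(f * edge for f, edge in zip(frac, cell))


def cartesian_to_spherical(xyz):
    """Return (r, theta, phi): radius, polar angle from +z, azimuth from +x."""
    x, y, z = xyz
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r) if r > 0 else 0.0
    phi = math.atan2(y, x)
    return r, theta, phi


print(cartesian_to_fractional((31.06, 63.69, 0.0)))
print(cartesian_to_spherical((1.0, 0.0, 0.0)))
```

For non-orthogonal cells the conversion requires the full orthogonalization matrix built from all six cell parameters, which is omitted here.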

Additional information, such as thermal parameters, which measure the motion of 
each atom in the structure, chain identifiers, which identify the particular chain of a 
multi-chain protein or protein co-complex in which an atom is located, and connectivity 
information, which indicates to which atoms a particular atom is bonded, is also useful for 
representing a three-dimensional molecular structure. 

Uses of the Atomic Structure Coordinates 

Structure information, typically in the form of the atomic structure coordinates, can be 
used in a variety of computational or computer-based methods to, for example, design, screen 
for and/or identify compounds that bind the crystallized polypeptide or a portion or fragment 
thereof, or to intelligently design mutants that have altered biological properties. 

In one embodiment, the co-crystals and structure coordinates obtained therefrom are 
useful for identifying and/or designing compounds that bind PapD, PapK, FimC or FimH as 
an approach towards developing new therapeutic agents. For example, a high resolution 
X-ray structure will often show the locations of ordered solvent molecules around the protein, 
and in particular at or near putative binding sites on the protein. This information can then be 
used to design molecules that bind these sites, the compounds synthesized and tested for 
binding in biological assays. Travis, 1993, Science 262:1374. 

In another embodiment, the structures are probed with a plurality of molecules to 
determine their ability to bind to PapD, PapK, FimC or FimH at various sites. Such 
compounds can be used as targets or leads in medicinal chemistry efforts to identify, for 
example, inhibitors of potential therapeutic importance. 

In specific embodiments described herein, the high resolution X-ray structures of the 
PapD/PapK and FimC/FimH co-complexes show details of the interactions between PapD 
and PapK, and between FimC and FimH, respectively. This information can be used to 
design molecules that bind to the sites of interaction, thereby blocking co-complex formation. 
In addition, the X-ray structure of the FimC/FimH co-complex has a C-HEGA molecule 
bound in the mannose-binding pocket of FimH, which can be used to model compounds that 
bind to the lectin and inhibit the FimH interaction with mannose oligosaccharides on host 
cells. 

In yet another embodiment, the structures can be used to computationally screen small 
molecule databases for chemical entities or compounds that can bind in whole, or in part, to 
PapD, PapK, FimC or FimH. In this screening, the quality of fit of such entities or 
compounds to the binding site may be judged either by shape complementarity or by 
estimated interaction energy. Meng et al., 1992, J. Comp. Chem. 13:505-524. 

The design of compounds that bind to PapD, PapK, FimC or FimH according to this 
invention generally involves consideration of two factors. First, the compound must be 
capable of physically and structurally associating with PapD, PapK, FimC or FimH. This 
association can be covalent or non-covalent. For example, covalent interactions may be 
important for designing suicide or irreversible inhibitors of a protein. Non-covalent 
molecular interactions important in the association of PapD with PapK or of FimC with FimH 
include hydrogen bonding, ionic interactions and van der Waals and hydrophobic 
interactions. Second, the compound must be able to assume a conformation that allows it to 
associate with PapD, PapK, FimC or FimH. Although certain portions of the compound will 
not directly participate in this association with the protein, those portions may still influence 
the overall conformation of the molecule. This, in turn, may have a significant impact on 
potency. Such conformational requirements include the overall three-dimensional structure 
and orientation of the chemical group or compound in relation to all or a portion of the 
binding site, or the spacing between functional groups of a compound comprising several 
chemical groups that directly interact with the protein. 

The potential inhibitory or binding effect of a chemical compound on PapD, PapK, 
FimC or FimH may be analyzed prior to its actual synthesis and testing by the use of 
computer modeling techniques. If the theoretical structure of the given compound suggests 
insufficient interaction and association between it and the protein, synthesis and testing of the 
compound is unnecessary. However, if computer modeling indicates a strong interaction, the 
molecule may then be synthesized and tested for its ability to bind to the protein and inhibit 
its activity. In this manner, synthesis of ineffective compounds may be avoided. 

An inhibitory or other binding compound of PapD, PapK, FimC or FimH may be 
computationally evaluated and designed by means of a series of steps in which chemical 
groups or fragments are screened and selected for their ability to associate with the individual 
binding pockets or interface surfaces of each of the proteins. One skilled in the art may use 
one of several methods to screen chemical groups or fragments for their ability to associate 
with PapD, PapK, FimC or FimH. This process may begin by visual inspection of, for 
example, the protein/protein interfaces or the mannose-binding site of FimH on the computer 
screen based on the PapD/PapK or FimC/FimH co-complex coordinates. Selected fragments 
or chemical groups may then be positioned in a variety of orientations, or docked, at an 
individual surface of PapD, PapK, FimC or FimH that participates in a protein/protein 
interface in the co-complex, or in the mannose-binding pocket of FimH, as defined supra. 
Docking may be accomplished using software such as QUANTA and SYBYL, followed by 
energy minimization and molecular dynamics with standard molecular mechanics forcefields, 
such as CHARMM and AMBER. 

Specialized computer programs may also assist in the process of selecting fragments 
or chemical groups. These include: 

1. GRID (Goodford, 1985, J. Med. Chem. 28:849-857). GRID is available from 
Oxford University, Oxford, UK; 

2. MCSS (Miranker & Karplus, 1991, Proteins: Structure, Function and Genetics 
11:29-34). MCSS is available from Molecular Simulations, Burlington, MA; 

3. AUTODOCK (Goodsell & Olson, 1990, Proteins: Structure, Function, and 
Genetics 8:195-202). AUTODOCK is available from Scripps Research Institute, La Jolla, 
CA; and 

4. DOCK (Kuntz et al., 1982, J. Mol. Biol. 161:269-288). DOCK is available 
from University of California, San Francisco, CA. 

Once suitable chemical groups or fragments have been selected, they can be 
assembled into a single compound or inhibitor. Assembly may proceed by visual inspection 
of the relationship of the fragments to each other in the three-dimensional image displayed on 
a computer screen in relation to the structure coordinates of PapD, PapK, FimC or FimH. 
This would be followed by manual model building using software such as QUANTA or 
SYBYL. 

Useful programs to aid one of skill in the art in connecting the individual chemical 
groups or fragments include: 

1. CAVEAT (Bartlett et al., 1989, 'CAVEAT: A Program to Facilitate the 
Structure-Derived Design of Biologically Active Molecules'. In 'Molecular Recognition in 
Chemical and Biological Problems', Special Pub., Royal Chem. Soc. 78:182-196). CAVEAT 
is available from the University of California, Berkeley, CA; 

2. 3D Database systems such as MACCS-3D (MDL Information Systems, San 
Leandro, Calif.). This area is reviewed in Martin, 1992, J. Med. Chem. 35:2145-2154; and 

3. HOOK (available from Molecular Simulations, Burlington, Mass.). 

Instead of proceeding to build an inhibitor of PapD/PapK or FimC/FimH co-complex 
formation, or of mannose binding to FimH, in a step-wise fashion one fragment or chemical 
group at a time, as described above, PapD-, PapK-, FimC- or FimH-binding compounds may 
be designed as a whole or 'de novo' using either an empty binding site or the surface of a 
protein that participates in protein/protein interactions in a co-complex, or optionally 
including some portion(s) of a known inhibitor(s) or of the second protein in the co-complex 
that participates in a particular protein/protein interaction at an interface. These methods 
include: 

1. LUDI (Bohm, 1992, J. Comp. Aid. Molec. Design 6:61-78). LUDI is available 
from Molecular Simulations, Inc., San Diego, CA; 

2. LEGEND (Nishibata & Itai, 1991, Tetrahedron 47:8985). LEGEND is 
available from Molecular Simulations, Burlington, Mass.; and 

3. LeapFrog (available from Tripos, Inc., St. Louis, Mo.). 

Other molecular modeling techniques may also be employed in accordance with this 
invention. See, e.g., Cohen et al., 1990, J. Med. Chem. 33:883-894. See also, Navia & 
Murcko, 1992, Current Opinion in Structural Biology 2:202-210. 

Once a compound has been designed or selected by the above methods, the efficiency 
with which that compound may bind to PapD, PapK, FimC or FimH may be tested and 
optimized by computational evaluation. For example, a compound that has been designed or 
selected to function as a FimH mannose-binding inhibitor must also preferably occupy a 
volume not overlapping the volume occupied by the mannose-binding site residues when 
mannose is bound. An effective inhibitor of PapD/PapK or FimC/FimH co-complex 
formation, or of FimH mannose binding must preferably demonstrate a relatively small 
difference in energy between its bound and free states (i.e., it must have a small deformation 
energy of binding). Thus, the most efficient inhibitors should preferably be designed with a 
deformation energy of binding of not greater than about 10 kcal/mol, preferably, not greater 
than 7 kcal/mol. Inhibitors may interact with the protein in more than one conformation that 
is similar in overall binding energy. In those cases, the deformation energy of binding is 
taken to be the difference between the energy of the free compound and the average energy of 
the conformations observed when the inhibitor binds to the protein. 
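
The deformation-energy criterion described above may be sketched, under stated assumptions, as follows. The energies are assumed to be conformational energies of the compound alone in kcal/mol, obtained from any suitable molecular mechanics or quantum mechanical program; the function names, the labels returned and the sample values are hypothetical.

```python
def deformation_energy(free_energy, bound_energies):
    """Average energy of the bound conformations minus the energy of the free compound."""
    if not bound_energies:
        raise ValueError("at least one bound conformation is required")
    average_bound = sum(bound_energies) / len(bound_energies)
    return average_bound - free_energy


def classify_deformation(energy, preferred_limit=7.0, upper_limit=10.0):
    """Apply the thresholds stated in the text (about 10 and 7 kcal/mol)."""
    if energy <= preferred_limit:
        return "preferred"
    if energy <= upper_limit:
        return "acceptable"
    return "rejected"


# Hypothetical energies in kcal/mol, for illustration only.
e_def = deformation_energy(-50.0, [-45.0, -43.0])
print(e_def, classify_deformation(e_def))
```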

A compound selected or designed for binding to PapD, PapK, FimC or FimH may be 
further computationally optimized so that in its bound state it would preferably lack repulsive 
electrostatic interaction with the target protein. Such non-complementary electrostatic 
interactions include repulsive charge-charge, dipole-dipole and charge-dipole interactions. 
Specifically, the sum of all electrostatic interactions between the inhibitor and the protein 
when the inhibitor is bound to it preferably makes a neutral or favorable contribution to the 
enthalpy of binding. 
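
As a simplified sketch of the electrostatic criterion, the pairwise Coulomb energy between point charges on the inhibitor and on the protein may be summed and its sign inspected. The sketch assumes partial charges in units of the electron charge, distances in angstroms, a uniform dielectric constant, and the usual conversion factor of about 332.06 kcal·Å/(mol·e²); the function names and the sample charges are hypothetical, and the programs listed below use considerably more detailed models.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal*angstrom/(mol*e^2), approximate


def electrostatic_energy(ligand_atoms, protein_atoms, dielectric=1.0):
    """Sum Coulomb energies over all ligand-protein atom pairs.

    Each atom is a (charge, (x, y, z)) tuple.
    """
    total = 0.0
    for q_lig, pos_lig in ligand_atoms:
        for q_prot, pos_prot in protein_atoms:
            distance = math.dist(pos_lig, pos_prot)
            total += COULOMB_CONSTANT * q_lig * q_prot / (dielectric * distance)
    return total


def is_neutral_or_favorable(energy):
    """A non-positive sum corresponds to a neutral or favorable contribution."""
    return energy <= 0.0


# Hypothetical charges and positions, for illustration only.
ligand = [(-1.0, (0.0, 0.0, 0.0))]
protein = [(1.0, (0.0, 0.0, 4.0))]
energy = electrostatic_energy(ligand, protein)
print(energy, is_neutral_or_favorable(energy))
```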

Specific computer software is available in the art to evaluate compound deformation 
energy and electrostatic interaction. Examples of programs designed for such uses include: 
Gaussian 92, revision C (Frisch, Gaussian, Inc., Pittsburgh, PA, ©1992); AMBER, version 
4.0 (Kollman, University of California at San Francisco, ©1994); QUANTA/CHARMM 
(Molecular Simulations, Inc., Burlington, MA, ©1994); and Insight II/Discover (Biosym 
Technologies Inc., San Diego, CA, ©1994). These programs may be implemented, for 
instance, using a computer workstation, as is well known in the art. Other hardware systems 
and software packages will be known to those skilled in the art. 

Once a PapD-, PapK-, FimC- or FimH-binding compound has been optimally selected 
or designed, as described above, substitutions may then be made in some of its atoms or 
chemical groups in order to improve or modify its binding properties. Generally, initial 
substitutions are conservative, i.e., the replacement group will have approximately the same 
size, shape, hydrophobicity and charge as the original group. One of skill in the art will 
understand that substitutions known in the art to alter conformation should be avoided. Such 
altered chemical compounds may then be analyzed for efficiency of binding to PapD, PapK, 
FimC or FimH by the same computer methods described in detail above. 

Because PapD/PapK co-complexes may crystallize in more than one crystal form, the 
structure coordinates of PapD/PapK co-complex, of PapD alone, of PapK alone, or of 
portions thereof, are particularly useful to solve the structure of those other co-crystal forms 
of PapD/PapK co-complex. They may also be used to solve the structure of mutants, of 
PapD/PapK co-complex further complexed to another molecule, or of the crystalline form of 
any other protein or protein co-complex with significant amino acid sequence homology to 
any functional domain of PapD or PapK. Similarly, the structure coordinates of FimC/FimH 
co-complex, of FimC alone, of FimH alone, or of portions thereof, are particularly useful to 
solve the structure of other co-crystal forms of FimC/FimH co-complex. They may also be 
used to solve the structure of mutants, of FimC/FimH co-complex further complexed to 
another molecule, or of the crystalline form of any other protein or protein co-complex with 
significant amino acid sequence homology to any functional domain of FimC or FimH. 

One method that may be employed for this purpose is molecular replacement. In this 
method, the unknown co-crystal structure, whether it is another co-crystal form of a 
PapD/PapK or FimC/FimH co-complex, a mutant, a PapD/PapK or FimC/FimH co-complex 
that is further complexed to another molecule, or the crystal of some other protein or protein 
co-complex with significant amino acid sequence homology to any functional domain of one 
of the proteins in the co-complex crystal, may be determined using phase information from 
the PapD/PapK or FimC/FimH structure coordinates, respectively. This method will provide 
an accurate three-dimensional structure for the unknown protein or protein co-complex in the 
new crystal more quickly and efficiently than attempting to determine such information ab 
initio. 

If an unknown crystal form has the same space group as and similar cell dimensions 
to the known co-complex crystal form, then the phases derived from the known crystal form 
can be directly applied to the unknown crystal form, and in turn, an electron density map for 
the unknown crystal form can be calculated. Difference electron density maps can then be 
used to examine the differences between the unknown crystal form and the known crystal 
form. A difference electron density map is a subtraction of one electron density map, e.g., 
that derived from the known crystal form, from another electron density map, e.g., that 
derived from the unknown crystal form. Therefore, all similar features of the two electron 
density maps are eliminated in the subtraction and only the differences between the two 
structures remain. For example, if the unknown crystal form is of a FimC/FimH co-complex 
that is further complexed with a mannose analog in the FimH mannose binding site, then a 
difference electron density map between this map and the map derived from the native, 
uncomplexed crystal will ideally show only the electron density of the differences between 
C-HEGA and the mannose analog. Similarly, if amino acid side chains have different 
conformations in the two crystal forms, then those differences will be highlighted by peaks 
(positive electron density) and valleys (negative electron density) in the difference electron 
density map, making the differences between the two crystal forms easy to detect. However, 
if the space groups and/or cell dimensions of the two crystal forms are different, then this 
approach will not work and molecular replacement must be used in order to derive phases for 
the unknown crystal form. 
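
The subtraction underlying a difference electron density map may be illustrated with the following minimal sketch, which assumes that both maps have already been calculated on the same grid and placed on a common scale, and represents each map as a flat list of density values. The function names, the threshold and the sample densities are hypothetical.

```python
def difference_map(map_unknown, map_known):
    """Subtract the known-form density from the unknown-form density, point by point."""
    if len(map_unknown) != len(map_known):
        raise ValueError("maps must be sampled on the same grid")
    return [u - k for u, k in zip(map_unknown, map_known)]


def peaks_and_valleys(diff, threshold):
    """Return grid indices of positive peaks and negative valleys beyond the threshold."""
    peaks = [i for i, value in enumerate(diff) if value >= threshold]
    valleys = [i for i, value in enumerate(diff) if value <= -threshold]
    return peaks, valleys


# Hypothetical density values on a four-point grid, for illustration only.
unknown_form = [0.1, 0.9, 0.2, 0.0]
known_form = [0.1, 0.1, 0.2, 0.8]
diff = difference_map(unknown_form, known_form)
print(peaks_and_valleys(diff, threshold=0.5))
```

Grid points where the two forms agree cancel to approximately zero, so only features present in one form and absent in the other remain above the threshold.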

All of the complexes referred to above may be studied using well-known X-ray 
diffraction techniques and may be refined versus 1.5 Å or higher to 3 Å resolution X-ray data 
to an R value of about 0.20 or less using computer software, such as X-PLOR (Yale 
University, ©1992, distributed by Molecular Simulations, Inc.). See, e.g., Blundell et al., 
1976, Protein Crystallography, Academic Press; Methods in Enzymology, vol. 114 & 115, 
Wyckoff et al., eds., Academic Press, 1985. This information may thus be used to optimize 
known classes of inhibitors of PapD/PapK or FimC/FimH co-complex formation or of 
mannose binding to FimH, and more importantly, to design and synthesize novel classes of 
inhibitors of PapD/PapK or FimC/FimH co-complex formation or of mannose binding to 
FimH. 

The structure coordinates of PapD/PapK or FimC/FimH mutant co-complexes will 
also facilitate the identification of related protein co-complexes analogous to the PapD/PapK 
or FimC/FimH co-complexes in function, structure or both, thereby further leading to novel 
therapeutic modes for treating or preventing Gram-negative bacteria-mediated diseases. 

Subsets of the atomic structure coordinates can be used in any of the above methods. 
Particularly useful subsets of the coordinates include, but are not limited to, coordinates of 
single domains, coordinates of residues lining an active site, coordinates of residues that 
participate in important protein-protein contacts at an interface, and Cα coordinates. For 
example, the coordinates of one domain of a protein that contains the active site may be used 
to design inhibitors that bind to that site, even though the protein is fully described by a larger 
set of atomic coordinates. Therefore, as described in detail for the specific embodiments 
below, a set of atomic coordinates that defines the entire polypeptide chain, although useful for 
many applications, does not necessarily need to be used for the methods described herein. 
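
By way of example only, extraction of such subsets from a full coordinate set may be sketched as follows. Each atom is assumed to be held as a dictionary with hypothetical keys "chain", "res_seq" and "name"; the function name and the sample atoms are likewise hypothetical and do not reproduce the coordinates of the invention.

```python
def select_subset(atoms, chain=None, residues=None, atom_names=None):
    """Return the atoms matching every criterion that is supplied."""
    selected = []
    for atom in atoms:
        if chain is not None and atom["chain"] != chain:
            continue
        if residues is not None and atom["res_seq"] not in residues:
            continue
        if atom_names is not None and atom["name"] not in atom_names:
            continue
        selected.append(atom)
    return selected


# Hypothetical atoms, for illustration only.
atoms = [
    {"chain": "D", "res_seq": 103, "name": "N"},
    {"chain": "D", "res_seq": 103, "name": "CA"},
    {"chain": "D", "res_seq": 105, "name": "CA"},
    {"chain": "K", "res_seq": 21, "name": "CA"},
    {"chain": "K", "res_seq": 21, "name": "CB"},
]

c_alpha_only = select_subset(atoms, atom_names={"CA"})
chain_d_residue_103 = select_subset(atoms, chain="D", residues={103})
print(len(c_alpha_only), len(chain_d_residue_103))
```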

Uses of subsets of atomic coordinates in specific embodiments 

The structure coordinates of the present invention, and subsets thereof, are useful for 
designing or screening for compounds that bind to the PapD, PapK, FimC or FimH proteins. 
The high resolution X-ray structures of the PapD/PapK and FimC/FimH co-complexes of the 
present invention show details of the interactions between PapD and PapK, and between 
FimC and FimH, respectively. This information can be used to design and/or screen for 
compounds that bind to the sites of interaction, thereby blocking co-complex formation and 
pilus assembly. In addition, the X-ray structure of the FimC/FimH co-complex has a 
C-HEGA molecule bound in the mannose-binding pocket of FimH, which can be used to model 
compounds that bind to the lectin domain and inhibit the FimH interaction with mannose on 
host cells. 

Those of skill in the art will recognize that the complete set of PapD/PapK 
co-complex structure coordinates and the complete set of FimC/FimH co-complex structure 
coordinates will be useful in the methods of the present invention. Those of skill in the art 
will further recognize that the coordinates of PapD, PapK, FimC and FimH will be useful 
separate from the coordinates of the protein with which each protein forms a co-complex in 
the crystals. In addition, those of skill in the art will recognize that subsets of the structure 
coordinates of each protein, such as the coordinates of a single domain or interface or binding 
pocket, will be useful in the methods of the invention, as discussed in more detail, below. 

In one embodiment, the PapK coordinates, or the subset of PapK coordinates that are 
the residues in the hydrophobic groove region of PapK (the K1 region), where the G1 
beta-strand of PapD interacts with PapK in the co-complex crystal structure, are useful for 
designing and/or screening for compounds that bind in the groove in order to prevent pilus 
assembly. A subset of structure coordinates of PapK useful in this embodiment of the 
invention includes those of Val16K, Leu21K, Val26K, Phe27K, Phe47K, Ile49K, Phe67K, Ile91K, 
Ile93K, Tyr146K, Ala150K, Thr151K, Phe152K, Leu154K and Tyr156K, as numbered in Fig. 3. 

In another embodiment, the PapD coordinates, or the subset of PapD coordinates that 
are the G1 beta-strand residues (the D1 region), which interacts with the K1 region by fitting 
into the hydrophobic groove of PapK in the PapD/PapK co-complex structure, are useful for 
designing compounds that have an analogous shape, such that the compounds fit into the 
PapK groove and inhibit pilus assembly. A subset of beta-strand structure coordinates of 
PapD useful in this embodiment includes those of Leu103D, Gln104D, Ile105D, Ala106D and Leu107D. 

In yet another embodiment, the PapD coordinates, or a subset of PapD coordinates in 
the D2 region, and the PapK coordinates, or a subset of PapK coordinates in the K2 region, 
which participate in a second interface of the PapD/PapK co-complex, are useful for 
designing and/or screening for compounds that disrupt this interaction and prevent 
PapD-PapK co-complex formation. A subset of PapK coordinates useful for this embodiment 
of the invention includes those of residues Val59K, Gly60K, Lys61K and Arg157K. A subset of 
PapD coordinates useful for this embodiment of the invention includes those of residues 
Thr152D, Ile154D, Glu164D, Glu165D, Thr170D, Ile194D and Arg200D. 

In another embodiment, the FimH coordinates, a subset of the FimH coordinates that 
are the pilin domain of FimH, or a subset of FimH coordinates that are the residues in the 
hydrophobic groove region of the pilin domain, where the G1 beta-strand of FimC interacts 
with FimH, are useful for designing and/or screening for compounds that inhibit this 
interaction, thereby inhibiting pilus formation in type 1 pili. A subset of FimH structure 
coordinates useful in this embodiment of the invention includes those of residues Ala150H, 
Asn152H, Val154H and Val156H, as numbered in Fig. 8. 

In yet another embodiment, the FimC coordinates, or a subset of FimC coordinates 
that are the residues of the G1 beta-strand that interact with the hydrophobic groove region of 
FimH, are useful for designing compounds that have an analogous shape, such that the 
compounds fit into the FimH groove and inhibit type 1 pilus assembly. A subset of FimC 
structure coordinates useful in this embodiment of the invention includes those of residues 
Ile103C, Leu105C and Ile107C. 

In another embodiment, the FimH coordinates, a subset of FimH coordinates that are 
the lectin domain of FimH, or a subset of FimH coordinates that comprise the mannose 
binding pocket of the lectin domain are useful for designing and/or screening for compounds 
that fit into the mannose binding pocket and block the interaction of FimH with host cell 
mannose oligosaccharides, thus preventing adhesion to host cells and E. coli pathogenesis. A 
subset of structure coordinates useful in this embodiment of the invention includes those of 
residues Phe1H, Asn46H, Asp47H, Tyr48H, Ile52H, Asp54H, Gln133H, Asn135H, Tyr137H, 
Asn138H, Asp140H and Phe142H. 

The following examples illustrate the invention, but are not to be taken as limiting the 
various aspects of the invention so illustrated. 

EXAMPLES 

Example 1: The PapD-PapK Chaperone-Subunit Co-Complex 

Expression of the PapD-PapK Co-Complex. The PapD-PapK co-complex was
overexpressed in E. coli and periplasms were prepared as described by Slonim et al. (EMBO J.
1992, 11:4747). Periplasms were then subjected to cation exchange (15S Source
(Pharmacia)) followed by hydrophobic interaction (15PHE Source (Pharmacia))
chromatography to yield pure co-complex. Selenomethionine (Se-Met) PapD-PapK
co-complexes were expressed in the E. coli methionine-auxotroph DL41 strain as
described by Hendrickson et al. (EMBO J. 1990, 9:1665) and purified in the same way as the
wild-type co-complex. The purified wild-type or Se-Met PapD-PapK co-complexes were dialyzed
against 20 mM KMES pH 6.7 and concentrated to ~12 mg/ml. Co-crystals were grown by
vapor diffusion using the hanging drop method against a reservoir containing 10-15% (w/v)
PEG 6000, 100 mM potassium acetate, and 200-400 mM sodium acetate at pH 4.6 [A.
McPherson, Eur. J. Biochem. 189, 23 (1990)] and appeared within three to five days. The
co-crystals were cryoprotected by increasing the concentration of PEG 6000 to 25% (w/v)
and flash-cooled to liquid nitrogen temperature. Co-crystals were in space group P212121,
with cell dimensions a = 62.12 ± 0.2 Å, b = 63.69 ± 0.2 Å, and c = 92.72 ± 0.2 Å, and with
one co-complex in the asymmetric unit. Table 4 contains a summary of the data collected and
refinement statistics.

(Fig. 6E). Thus the donor strand complementation by the G 2 beta-strand of PapD shields the
hydrophobic core of the pilin from exposure to the aqueous milieu of the periplasm. 

The K1-D1 interaction also involves contacts at the end of the groove nearest the cleft
of the chaperone. These interactions consist of hydrophobic and polar contacts between the
A1 strand of PapK and the Al l9 A2 1 and Q strands of PapD (Figs. 6A and 6B). The
COOH-terminal carboxylate of PapK anchors the subunit into the cleft of PapD by hydrogen bonding
to the invariant Arg 8 and Lys 112 residues of PapD as well as to the Oγ hydroxyl of highly
conserved Thr 152 (Figs. 6C and 6D).

Site K2 is formed primarily by residues in helix 3 10 C and the COOH-terminal Arg 157
side chain of PapK (Figs. 6C and 6D). This interface is less extensive than site K1 (455 Å²).
Residues in site K2 interact with residues in the C2 and D2 strands and with the F2-G2 loop of
domain 2 of PapD (Site D2). The K2-D2 interface includes hydrogen bonds between Thr 57 of
PapK and the main-chain carbonyls of Glu 164 and Glu 165 of PapD, as well as polar and
hydrophobic contacts involving Lys 61 and Ile 62 of PapK and Arg 200 and Ile 154 of PapD.


Example 2: Preparation and comparison of FimA subunits 
from different strains of E. coli. 

Genomic DNA was prepared from overnight broth cultures of 59 uropathogenic E.
coli strains using the Puregene DNA Isolation Kit (Minneapolis, MN). DNA was amplified
by PCR with Taq polymerase (Perkin Elmer) and the following primers, which flank the fimA locus:
5'-CATCGCTGGCACAGGAAGGAGC-3' (SEQ ID NO: 53) and
5'-GTTGGTATGACCCGCATCAATCGC-3' (SEQ ID NO: 54),
under the following conditions: cycle 1 (95°C for 1 min), cycles 2-30 (95°C for 30 sec, 50°C
for 30 sec, 72°C for 2 min) in the presence of 3.0 mM MgCl2. The amplified FimA
fragments were purified with a QIAquick Purification Kit (Qiagen, Germany), sequenced
directly without subcloning using the dRhodamine Terminator Cycle Sequencing Kit (Perkin
Elmer, Norwalk, CT) and analyzed on the ABI 373 Automated DNA Sequencer (PE Applied
Biosystems, Foster City, CA). The FimA sequences were aligned and compared using the
Lasergene software program (DNAStar). 
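
The cycling conditions of Example 2 can be written down as a small data structure, which makes the cycle count and the total programmed hold time easy to check. The following Python sketch is illustrative only: the names Step, INITIAL_CYCLE, REPEATED_CYCLE, PRIMER_SEQ_53 and PRIMER_SEQ_54 are assumptions introduced here, ramp times between temperatures are ignored, and nothing in it is part of the described method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float  # block temperature in degrees Celsius
    seconds: int   # programmed hold time in seconds


# Values as stated in Example 2; names are hypothetical.
INITIAL_CYCLE = (Step(95, 60),)                                # cycle 1
REPEATED_CYCLE = (Step(95, 30), Step(50, 30), Step(72, 120))   # cycles 2-30
N_REPEATS = 30 - 2 + 1                                         # cycles 2 through 30 inclusive

PRIMER_SEQ_53 = "CATCGCTGGCACAGGAAGGAGC"    # SEQ ID NO: 53, written 5' to 3'
PRIMER_SEQ_54 = "GTTGGTATGACCCGCATCAATCGC"  # SEQ ID NO: 54, written 5' to 3'


def total_hold_seconds() -> int:
    """Sum of all programmed hold times (ramping between steps is ignored)."""
    initial = sum(step.seconds for step in INITIAL_CYCLE)
    per_cycle = sum(step.seconds for step in REPEATED_CYCLE)
    return initial + N_REPEATS * per_cycle


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    print(f"programmed hold time: {total_hold_seconds() / 60:.0f} min")
    print(f"primer lengths: {len(PRIMER_SEQ_53)} nt and {len(PRIMER_SEQ_54)} nt")
```

The reverse_complement helper is included only because locating a primer on the opposite strand of a reference sequence is a common follow-up step; the example itself does not state which primer anneals to which strand.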

Example 3: Structure of FimH in the FimH-FimC Co-Crystal

FimH is folded into two domains of the all-beta class. The NH2-terminal
mannose-binding domain comprises residues 1H - 156H, and the COOH-terminal pilin domain, which
is used to anchor the adhesin to the pilus, comprises residues 160H - 279H. A short extended
linker (residues 157H - 159H) connects the two domains. FimC in the co-complex has the
same overall structure as free FimC. The pilin domain of FimH binds in the cleft of the
chaperone, but mostly to the chaperone's NH2-terminal domain.

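The domain boundaries given above lend themselves to a simple lookup, which can be handy when sorting residues from a coordinate file by domain. The sketch below is a minimal illustration; the names FIMH_DOMAINS and fimh_domain are assumptions introduced here, and the "H" chain suffix used in this document is dropped from the residue numbers.

```python
# Residue ranges as stated in Example 3 (inclusive, FimH numbering).
FIMH_DOMAINS = (
    ("lectin", 1, 156),    # NH2-terminal mannose-binding domain
    ("linker", 157, 159),  # short extended linker
    ("pilin", 160, 279),   # COOH-terminal pilin domain
)


def fimh_domain(residue: int) -> str:
    """Return the name of the FimH domain that contains the given residue."""
    for name, first, last in FIMH_DOMAINS:
        if first <= residue <= last:
            return name
    raise ValueError(f"residue {residue} is outside the range 1-279")


if __name__ == "__main__":
    for res in (54, 135, 158, 163, 276):
        print(res, fimh_domain(res))
```
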
The lectin domain of FimH is an 11-stranded elongated beta-barrel with a
jelly roll-like topology (Figure 8B). A pocket capable of accommodating a mono-mannose unit is
located at the tip of the domain, distal from the connection to the pilin domain (Figure 9B).
The bottom of the pocket is lined with asparagine, glutamine and aspartic acid residues,
which are typical carbohydrate-binding side chains, in three loop regions (Figure 10A). A
molecule of cyclohexylbutanoyl-N-hydroxyethyl-D-glucamide (C-HEGA) is bound in this
pocket. C-HEGA is not a known inhibitor of FimH mannose binding but was needed in the
crystallization to produce useful co-crystals of FimC-FimH co-complex. The glucamide
moiety of C-HEGA is blocked at C1 and cannot form a pyranose, but is bent to approach the
pyranose conformation. The C2, C3, C4 and C6 hydroxyl groups of C-HEGA are enclosed
within the pocket, whereas the C5 hydroxyl and cyclohexylbutanoyl-N-hydroxyethyl groups
point out from the pocket and are solvent exposed. Residues Asp 54H, Gln 133H, Asn 135H, Asp 140H
and the NH2-terminal amino group of FimH (Figure 10A) are hydrogen bonded to the
glucamide moiety of C-HEGA. FimH from a urinary tract E. coli isolate which has a lysine
instead of asparagine at position 135H produces type 1 pili but is unable to mediate
mannose-sensitive hemagglutination of guinea pig erythrocytes (S. Langermann, unpublished results).
Also, a mutation at residue 136H has been reported to completely block mannose binding.
See Schembri et al., FEMS Microbiol. Lett., 137, 257 (1996).

The pilin domain of FimH has the same immunoglobulin-like topology as the
NH2-terminal domain of FimC, except that the seventh strand of the fold is missing. Two
antiparallel beta-sheets (strands A'BED' and D"CF) pack against each other to form a beta-barrel
that is similar to, but distinct from, immunoglobulin barrels. As in the chaperones, strand
switching occurs at the edges of the sheets. In the chaperones, the A1 strand of the
NH2-terminal domain switches between the two sheets of the barrel. The first strand of the pilin
domain exhibits a similar switch, but due to the lack of a seventh strand, the second half of
the A strand is not involved in main chain hydrogen bonding within the domain. The D
strand of the chaperones as well as of the FimH pilin domain also switches, but in the pilin
domain the switch is an 8-residue loop instead of the cis-proline bulge found in the
chaperones. The C-D loop and the D'-D" connection pack against each other and close the
top of the barrel. The other side of the barrel, defined by the A and F edge strands, is open.
Due to the absence of a seventh strand, a deep scar is created on the surface of the domain.
Residues that would be part of the hydrophobic core of an intact, seven-stranded PapD-like
domain instead line a deep hydrophobic crevice on the surface of the pilin domain.

Example 4: FimC-FimH Co-crystal Structure 

FimC-FimH co-crystals were grown by hanging drop vapor diffusion by mixing 2 µl
of a protein solution (4 mg of FimC-FimH co-complex per milliliter pre-equilibrated in 300
mM of HEGA) with 2 µl of reservoir solution containing 1 M ammonium sulfate in 0.1 M
tris-HCl buffer (pH 8.2). The structure of the FimC-FimH co-complex was solved to 2.5 Å
(Table 5). Eight copies of the FimC-FimH co-complex in the asymmetric unit were arranged
as two sets of four molecules related by approximate four-fold screw axes. Electron density was
excellent for one set of molecules (Figure 9A), allowing applicants to trace the entire
co-complex. For the second set of molecules, electron density was poorer but allowed for
unambiguous placement of a copy of the initially traced co-complex.

Two seleno-methionine FimC-FimH co-crystals were used to collect MAD (W. A. 
Hendrickson, Science 254: 51 (1991)) data on BM14 of the ESRF. Data were recorded at 
each of 3 wavelengths corresponding to the peak of the Se white line, the point of inflexion of
the K absorption edge, and a remote wavelength using a MAR CCD detector. Data were
reduced using the program HKL2000 (Z. Otwinowski and W. Minor "Methods in 
Enzymology" C. W. Carter, R. M. Sweet, Eds. (Academic Press, New York, 1997), vol. 276, 
pp. 307), with further processing and scaling using the CCP4 processing package (CCP4, 
Acta. Cryst. D50, 760 (1994)).

The co-crystals used for the structure determination belong to the space group C2 with
cell dimensions a = 139.08 ± 0.2 Å, b = 139.08 ± 0.2 Å, c = 214.49 ± 0.2 Å, and beta = 89.97
± 0.2°. The co-crystals exhibit strong pseudo ¥4^2 symmetry. An initial solution to the
Patterson function was produced in the tetragonal pseudo space group both automatically
using the program SOLVE (T. C. Terwilliger and J. Berendzen, Acta. Cryst. D53, 571
(1997)) and manually using the program RSPS (S. Knight, I. Andersson, C.-I. Branden, J.
Mol. Biol. 215: 113 (1990)), and initial phases were calculated using SHARP (E. de la Fortelle and
G. Bricogne, in Methods in Enzymology, C. W. Carter, R. M. Sweet, Eds. (Academic Press,
New York, 1997), vol. 276, pp. 472). Density modification including 4-fold
non-crystallographic symmetry (NCS) averaging was done using the program DM (K. D. Cowtan, Joint
CCP4 ESF-EACBM Newsl. Protein Crystallogr. 31: 34 (1994)). A model corresponding to
the two copies of the co-complex in the pseudo asymmetric unit was built using O (T. A.
Jones et al., Acta. Cryst. A47, 110 (1991)), modeled in 4-fold averaged electron density and
refined against 2.5 Å native data applying tight non-crystallographic restraints. The crystals
are in either space group P4A2 or P4 3 , with cell dimensions a = b = 97.7 ± 0.2 Å and
c = 215.9 ± 0.2 Å. Bulk solvent correction, positional, simulated annealing, and
isotropic temperature factor refinement have been carried out using X-PLOR (A. T. Brünger,
X-PLOR Manual (Version 3.1): A system for X-ray crystallography and NMR (Yale
University Press, New Haven, CT, 1993)) and REFMAC (G. N. Murshudov, A. A. Vagin, E.
J. Dodson, Acta. Cryst. D53, 240 (1997)) with tight NCS restraints against a 2.5 Å native data
set collected at Max II/BL711 in Lund. The current R-factor and R-free (on 5% of the data)
are 24.0% and 26.8%, respectively. The r.m.s. deviations from ideal bond length and angle
values are 0.016 Å and 3.3°, respectively. No residues are found in disallowed regions of the
Ramachandran plot. The coordinates have been deposited at the Research Collaboratory for
Structural Bioinformatics Protein Data Bank (code 1QUN).
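
As a quick consistency check on the two sets of cell dimensions quoted in this example, the unit cell volumes can be computed from the standard general (triclinic) cell volume formula. The Python sketch below is illustrative only; the function name cell_volume is an assumption, the angle beta of the C2 cell is taken to be in degrees, and unspecified angles are taken to be 90 degrees.

```python
import math


def cell_volume(a: float, b: float, c: float,
                alpha: float = 90.0, beta: float = 90.0, gamma: float = 90.0) -> float:
    """Unit cell volume in cubic angstroms from edge lengths (angstroms) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(angle)) for angle in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


# Cell dimensions quoted in Example 4.
V_C2 = cell_volume(139.08, 139.08, 214.49, beta=89.97)   # C-centered monoclinic setting
V_TETRAGONAL = cell_volume(97.7, 97.7, 215.9)            # pseudo-tetragonal setting

if __name__ == "__main__":
    print(f"C2 cell:         {V_C2:,.0f} cubic angstroms")
    print(f"tetragonal cell: {V_TETRAGONAL:,.0f} cubic angstroms")
    print(f"ratio:           {V_C2 / V_TETRAGONAL:.2f}")
```

With the quoted dimensions the C2 cell comes out at roughly twice the volume of the tetragonal cell, which is what one would expect if the two descriptions refer to the same lattice in a centered and a primitive setting; that reading is an inference from the numbers, not a statement made in the text.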

with the G1 strand of FimC and has its COOH-terminal carboxyl group anchored in
the crevice of the chaperone cleft through hydrogen bonding with the conserved residues
Arg 8C and Lys 112C in FimC (Figure 9A).

The G1 beta-strand of the FimC chaperone contains a conserved motif of
solvent-exposed hydrophobic residues at positions 103, 105, and 107. In the FimC-FimH
co-complex, these residues are used to complete the unfinished hydrophobic core of FimH
(Figure 10C). The two residues Leu 103C and Leu 105C are deeply buried in the crevice created in
the FimH pilin domain due to the missing seventh strand. Ile 107C is somewhat closer to the
domain surface but makes van der Waals contacts with residues Val 163H and Phe 276H. Leu 103C
contacts residues Ile 181H, Val 223H, Leu 225H and Ile 272H. Leu 105C is in contact with Ile 181H, Leu 183H,
Leu 252H, Ile 272H and Val 274H. This mode of binding is called "donor strand complementation"
to emphasize the fact that the pilin domain is incomplete and that the chaperone donates its
G1 beta-strand to complete the fold of the pilin.

Example 5: Subunit-subunit interactions in Type 1 Pili

Genetic, biochemical and electron microscopic studies have demonstrated that
residues in the two conserved motifs (the COOH-terminal F strand and an NH2-terminal
motif) participate in subunit-subunit interactions necessary for pilus assembly. See G.E. Soto
et al., EMBO J., 17: 6155 (1998). An alignment of the pilin sequences, based on the
FimC-FimH co-crystal structure, revealed that the NH2-terminal motif was part of a 10-20 residue
NH2-terminal extension that was missing in the FimH pilin domain (Figure 8A). This region
contains a highly conserved pattern of alternating hydrophobic residues (highlighted in Figure
8A) similar to the donor G1 beta-strand of the chaperone. This motif is structurally analogous
to the G1 donor strand motif of the chaperone and molecular modeling indicates that it would
be able to fit into the same groove occupied by the donor G1 beta-strand of the chaperone.

The type 1 pilus is a right-handed helix with about 3 subunits per turn, a diameter of
approximately 70 Å, a central pore of about 20-25 Å, and a rise per subunit of about 8 Å. In
order to obtain this structure, insertion of the NH2-terminal extension must be antiparallel to
strand F, in contrast to the parallel insertion observed for the G1 beta-strand of the chaperone.
Insertion in a parallel orientation would lead to rosette-like structures. One edge of the pilin
groove is lined by the COOH-terminal F strand, which has been shown to form a critical part
of the subunit tail. Thus, the NH2-terminal extension represents the head of a subunit and
during pilus biogenesis, it would displace the donor G1 beta-strand of the chaperone to fit into
the tail groove of a neighboring subunit and to complete the pilin fold of its neighbor in a
donor strand complementation mechanism.

Using the FimH pilin domain as a model for FimA, applicants constructed a model for
the type 1 pilus that fit these data (Figure 11). Each subunit was aligned to have its cleft
facing towards the center of the pilus so that the height from the top to the bottom of the
domain along the helix axis was approximately 25 Å. Applying a rotation of 115 degrees and
a rise per subunit of 8 Å, a hollow helical cylinder is created. The outer diameter of this
cylinder as measured across Cα atoms is 70 Å, and the inner diameter is 25 Å. FimA subunits
from different strains of E. coli exhibit considerable allelic variation. The vast majority of the
variable positions are on the outside surface of the pilus model proposed above (Figure 11),
which would account for the antigenic variability of type 1 pili.

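The helical parameters used for the pilus model (a rotation of 115 degrees and a rise of 8 Å per subunit) determine the number of subunits per turn and the helical pitch. The Python sketch below works through that arithmetic and generates subunit reference points on a helix. It is an illustration under stated assumptions: the function names are hypothetical, each subunit is reduced to a single point, and the point is placed at the mid-wall radius between the 25 Å inner and 70 Å outer diameters, a placement chosen here for convenience rather than taken from the model.

```python
import math

ROTATION_DEG = 115.0   # rotation per subunit, from Example 5
RISE_A = 8.0           # rise per subunit in angstroms, from Example 5
OUTER_DIAMETER_A = 70.0
INNER_DIAMETER_A = 25.0
# Assumption: one reference point per subunit, placed mid-way through the wall.
MID_WALL_RADIUS_A = (OUTER_DIAMETER_A + INNER_DIAMETER_A) / 4.0


def subunits_per_turn(rotation_deg: float = ROTATION_DEG) -> float:
    """Number of subunits needed to complete one 360 degree turn."""
    return 360.0 / rotation_deg


def pitch(rotation_deg: float = ROTATION_DEG, rise: float = RISE_A) -> float:
    """Axial advance of the helix over one full turn, in angstroms."""
    return subunits_per_turn(rotation_deg) * rise


def subunit_positions(n: int, radius: float = MID_WALL_RADIUS_A):
    """(x, y, z) reference points for n consecutive subunits.

    The angle increases counterclockwise as z increases, which gives a
    right-handed helix in a right-handed coordinate system.
    """
    points = []
    for i in range(n):
        theta = math.radians(ROTATION_DEG * i)
        points.append((radius * math.cos(theta), radius * math.sin(theta), RISE_A * i))
    return points


if __name__ == "__main__":
    print(f"subunits per turn: {subunits_per_turn():.2f}")
    print(f"pitch: {pitch():.1f} angstroms")
    for x, y, z in subunit_positions(4):
        print(f"{x:8.2f} {y:8.2f} {z:8.2f}")
```

The resulting pitch of about 25 Å matches the stated height of a subunit along the helix axis, so successive turns of the model stack without a gap.
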
The proposed head-to-tail interaction between subunits in a pilus is reminiscent of
oligomerization through three-dimensional domain swapping in the sense that a part of the
molecule is used to complement another. However, in this case, complementation occurs not
only between identical protein chains (FimA in the pilus rod) but also between homologous
but distinct chains, e.g., FimG, FimF and FimH in the pilus tip. Furthermore, because
individual pilin protomers do not exist as stable monomers, there is no exchange of
structural units between a monomeric and an oligomeric state. Instead, a different protein,
the periplasmic chaperone, is needed to keep the monomeric subunits in solution by donating
a unique part of its structure (the G1 beta-strand) to the different subunit grooves.

Based on the structure of the FimC-FimH co-complex, pilins are missing the
steric information needed to fold into a native three-dimensional structure. The
information that is missing consists of the seventh edge strand of an immunoglobulin fold.
This strand, which is necessary for folding, is donated to the hydrophobic core of the pilin by
the periplasmic chaperone in a donor strand complementation mechanism. Thus, the steric
information necessary for newly synthesized protein chains to fold correctly is not inherent in
the sequence of the protein to be folded; instead, such information is transferred
from another protein, the periplasmic chaperone.

Example 6: FimH Binding to FimC and FimG by ELISA Assay

The ability of FimH to bind to peptides corresponding to the G1 beta-strand of FimC
and the N-terminal extension of FimG was tested using an ELISA assay. During pilus
assembly, the G1 beta-strand of FimC completes the Ig fold of the FimH pilin domain in the
periplasm and then in the pilus the N-terminal extension of FimG completes the Ig fold of the
FimH pilin domain.

In order to assess the ability of FimH to bind to the two peptides, FimH was purified
from the FimC-FimH co-complex. Peptides were synthesized corresponding to the
G1 beta-strand of FimC and the N-terminal extension of FimG. The synthesized peptide
sequences are as follows: FimC peptide, NTLQLAHSR (SEQ ID NO: 55) and FimG peptide,
DVTITVNGK (SEQ ID NO: 56). Stock solutions of the peptides (5 mg/ml) were prepared
in DMSO.

The peptides were diluted in phosphate buffered saline (PBS) (120 mM NaCl, 2.7 mM
KCl, 10 mM, 10 mM PBS, pH 7.4) to 2 nmol/50 µl. FimC protein was diluted to 0.1
nmol/50 µl and coated overnight onto microtiter wells with 50 µl/well at 4°C. The ELISA
assay was carried out as described in Kuehn et al., 1993 and Hung et al., 1996. Briefly, the
wells were washed three times with PBS and blocked with 3% Bovine Serum Albumin
(BSA) in PBS for two hours at 25°C. Then the wells were washed three times with PBS. The
FimC-FimH co-complex was incubated in 3 M urea to separate the two proteins. Pure FimH
in 3 M urea was collected from the flow through of a Source 15S column (Pharmacia). See
Barnhart et al., PNAS USA 97: 7709-7714 (2000). The wells were incubated with 50 µl of
FimH in 3% BSA-PBS diluted to 5-25 pmol/well FimH for 45 minutes at 25°C. The wells
were washed 3 times with PBS followed by incubation with a 1:1000 dilution of mouse
anti-FimH antibodies in 3% BSA-PBS for 45 minutes at 25°C. The wells were washed 3 times
with PBS followed by incubation with a 1:1000 dilution of goat antiserum to mouse IgG
(Sigma) conjugated to alkaline phosphatase diluted in 3% BSA-PBS for 45 minutes at 25°C.
The wells were washed 3 times with PBS and washed 3 times with developing buffer (10 mM
diethanolamine, 0.4 mM MgCl2). The ELISA was developed by adding 50 µl of substrate
(50 µl of filtered 1 mg/ml p-nitrophenyl phosphate; Sigma) in developing buffer. The reaction
was incubated for 1 hour at 25°C in the dark and the absorbance at 405 nm was read.

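The per-well amounts quoted in this example are easier to compare once converted to molar concentrations in the 50 µl well volume. The short Python sketch below does that conversion. It is illustrative only: the function and variable names are assumptions introduced here, and every well is assumed to hold exactly 50 µl.

```python
WELL_VOLUME_UL = 50.0  # well volume stated in Example 6


def micromolar(amount_nmol: float, volume_ul: float = WELL_VOLUME_UL) -> float:
    """Convert an amount in nmol dissolved in volume_ul microliters to micromolar.

    1 nmol per microliter equals 1 mM, i.e. 1000 micromolar.
    """
    return amount_nmol / volume_ul * 1000.0


# Amounts per well as stated in Example 6.
PEPTIDE_NMOL = 2.0        # FimC or FimG peptide
FIMC_NMOL = 0.1           # FimC protein used for coating
FIMH_LOW_NMOL = 0.005     # 5 pmol FimH
FIMH_HIGH_NMOL = 0.025    # 25 pmol FimH

if __name__ == "__main__":
    print(f"peptide: {micromolar(PEPTIDE_NMOL):.1f} micromolar")
    print(f"FimC:    {micromolar(FIMC_NMOL):.1f} micromolar")
    print(f"FimH:    {micromolar(FIMH_LOW_NMOL):.2f} to {micromolar(FIMH_HIGH_NMOL):.2f} micromolar")
    print(f"peptide to FimH molar ratio at 5 pmol FimH: {PEPTIDE_NMOL / FIMH_LOW_NMOL:.0f}")
```

On these figures a peptide at 2 nmol/well is present in several-hundred-fold molar excess over FimH at 5 pmol/well, which is the regime used for the competition assays.
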
The competition assays were carried out similarly. FimC was coated onto microtiter
wells at 0.1 nmol/well. FimH at 5 pmol/well in 3% BSA-PBS was added to the FimC-coated
wells in the presence or the absence of the FimC or FimG peptide at 2 nmol/well or the
indicated peptide concentration. Further, increasing concentrations of FimH were incubated
with constant concentrations of the FimC or FimG peptides or the FimC protein immobilized
on microtiter wells. FimH bound well both to pure FimC protein immobilized on microtiter
wells (Fig. 12) and to the peptides corresponding to the G1 beta-strand of FimC and the
N-terminal extension of FimG (Figure 12). Next, the ability of the peptides to inhibit FimH
binding to FimC was tested. FimH was added to the FimC-coated wells in the presence or
absence of peptides to FimC or FimG. Increasing concentrations of the FimC peptide further
decreased the ability of FimH to bind to FimC immobilized on microtiter wells (Fig. 13).

The FimC peptide inhibited the ability of FimH to bind to FimC immobilized on the
microtiter wells (Fig. 14); however, the FimG peptide at the tested concentration did not
inhibit the ability of FimH to bind to FimC (Fig. 14).

Other features, objects and advantages of the present invention will be apparent to
those skilled in the art. The explanations and illustrations presented herein are intended to
acquaint others skilled in the art with the invention, its principles, and its practical
application. Those skilled in the art may adapt and apply the invention in its numerous
forms, as may be best suited to the requirements of a particular use. Accordingly, the specific
embodiments of the present invention as set forth are not intended as being exhaustive or
limiting of the present invention.

We claim:

1. An isolated compound which binds to a pilus subunit groove thereby inhibiting pilus
assembly.

2. The compound of claim 1 wherein the compound is a peptide. 

3. The compound of claim 1 wherein the compound is a non-peptide compound. 

4. The compound of claim 1 further comprising a mimic of a chaperone G1 beta-strand
with at least two alternating hydrophobic amino acid residues which exhibits
antibacterial activity against a Gram-negative bacterium.

5. The compound of claim 4 wherein said mimic further comprises the amino acid 
sequence NVLQIAL (SEQ ID NO: 1) or an analogue thereof. 

6. The compound of claim 4 wherein the mimic has been modified to improve binding, 
specificity, solubility, safety, or efficacy. 

7. The mimic of claim 4 wherein said mimic exhibits antibacterial activity against a 
Gram-negative bacterium comprising Escherichia coli, Haemophilus influenzae, 
Salmonella enteriditis, Salmonella typhimurium, Bordetella pertussis, Yersinia pestis, 
Yersinia enterocolitica, Helicobacter pylori and Klebsiella pneumoniae. 

8. The compound of claim 1 further comprising a mimic of an amino-terminal motif of a 
pilus subunit with at least two alternating hydrophobic amino acid residues which 
mimic exhibits antibacterial activity against a Gram-negative bacterium.

9. The compound of claim 8 wherein said mimic of an amino-terminal motif of a pilus
subunit further comprises the amino acid sequence SDVAFRGNLL (SEQ ID NO: 12)
or an analogue thereof.

10. The compound of claim 9 wherein said mimic has been modified to improve binding,
specificity, solubility, safety, or efficacy.

11. The compound of claim 8 wherein said mimic exhibits antibacterial activity against a
Gram-negative bacterium comprising Escherichia coli, Haemophilus influenzae, 
Salmonella enteriditis, Salmonella typhimurium, Bordetella pertussis, Yersinia pestis, 
Yersinia enterocolitica, Helicobacter pylori and Klebsiella pneumoniae. 

12. The compound of claim 1 which is a 10 to 20 residue peptide or peptide analog
according to formula (I):

(I) Z1~Z2-X1-X2-X3-X4-X5-X6-X7-X8-X9-X10-Z3~Z4

or a pharmaceutically-acceptable salt thereof, wherein:
Z1 is R-C(O)-NR- or RRN-;
Z2 is an optional 1 to 5 residue peptide or peptide analog;
X1 is any amino acid residue;
X2 is any amino acid residue;
X3 is a hydrophobic residue or a hydroxyl-substituted aliphatic residue;
X4 is any amino acid residue;
X5 is a hydrophobic residue or Gly;
X6 is a hydrophobic or a hydrophilic residue;
X7 is Gly, an amide-substituted polar residue or a hydrophobic residue;
X8 is any amino acid residue;
X9 is an aliphatic residue;
X10 is any amino acid residue;
Z3 is an optional 1 to 5 residue peptide or peptide analog;
Z4 is -C(O)OR or -C(O)NRR;

each R is independently hydrogen, (C1-C6) alkyl, (C2-C6) alkenyl, (C2-C6)
alkynyl or (C6-C14) aryl;

each "-" between residues X1 through X10, Z2 and X1, and X10 and Z3
independently represents an amide linkage, a substituted amide linkage or an isostere
of an amide linkage; and

each "~" represents a bond.

13. The compound of claim 12 wherein said compound further comprises one or more
features selected from the group consisting of:

each "-" between residues X1 through X10, Z2 and X1, and X10 and Z3 is an
amide linkage;
Z1 is H2N-;
Z4 is -C(O)OH or a salt thereof;
optional Z2 is not present;
optional Z3 is not present;
X1 is other than a basic residue;
X2 is other than an aliphatic residue;
X3 is an aliphatic residue or T;
X4 is other than an acidic residue;
X5 is an aliphatic residue, F or G;
X7 is G, N or A;
X8 is other than an aliphatic residue; and
X10 is an aliphatic or a polar residue.

14. The compound of claim 13 which is selected from the group consisting of SEQ ID
NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO:
7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12,
SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17,
SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22,
SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27,
SEQ ID NO: 28 and SEQ ID NO: 29.

15. The compound of claim 12 wherein said compound exhibits antibacterial activity
against a Gram-negative bacterium comprising E. coli, H. influenzae, S. enteriditis, S.
typhimurium, B. pertussis, Y. pestis, Y. enterocolitica, H. pylori and K. pneumoniae.

16. The antibacterial compound of claim 1 which is a 7 to 17 residue peptide or peptide
analog according to formula (II):

(II) Z11~Z12-X11-X12-X13-X14-X15-X16-X17-Z13~Z14

or a pharmaceutically-acceptable salt thereof, wherein:
Z11 is R'-C(O)-NR'- or R'R'N-;
Z12 is an optional 1 to 5 residue peptide or peptide analog;
X11 is any amino acid residue;
X12 is any amino acid residue;
X13 is a hydrophobic residue;
X14 is any amino acid residue;
X15 is a hydrophobic residue;
X16 is any amino acid residue;
X17 is a hydrophobic residue or a hydroxyl-substituted aliphatic residue;
Z13 is an optional 1 to 5 residue peptide or peptide analog;
Z14 is -C(O)OR' or -C(O)NR'R';

each R' is independently hydrogen, (C1-C6) alkyl, (C2-C6) alkenyl, (C2-C6)
alkynyl or (C6-C14) aryl;

each "-" between residues X11 through X17, Z12 and X11, and X17 and Z13
independently represents an amide linkage, a substituted amide linkage or an isostere
of an amide linkage; and

each "~" independently represents a bond.

17. The compound of claim 16 wherein said compound further comprises one or more
features selected from the group consisting of:

each "-" between residues X11 through X17, Z12 and X11, and X17 and Z13 is an
amide linkage;
Z11 is H2N-;
Z14 is -C(O)OH or a salt thereof;
optional Z12 is not present;
optional Z13 is not present;
X11 is other than a basic residue;
X13 is an aliphatic residue or M;
X14 is other than an aromatic residue;
X15 is an aliphatic residue, F or M; and
X17 is an aliphatic residue, F, M or a hydroxyl-substituted aliphatic residue.

18. The compound of claim 17 which is selected from the group consisting of SEQ ID
NO: 1, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID
NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID
NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID
NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID
NO: 49, SEQ ID NO: 50, SEQ ID NO: 51 and SEQ ID NO: 52.

19. The compound of claim 16 wherein said compound exhibits antibacterial activity
against a Gram-negative bacterium comprising E. coli, H. influenzae, S. enteriditis, S.
typhimurium, B. pertussis, Y. pestis, Y. enterocolitica, H. pylori and K. pneumoniae.

20. A mannose analogue capable of competitively binding the amino-terminal
mannose-binding domain of a Gram-negative bacterial adhesin.

21. The analogue of claim 20 wherein said compound exhibits antibacterial activity
against a Gram-negative bacterium comprising Escherichia coli, Haemophilus
influenzae, Salmonella enteriditis, Salmonella typhimurium, Bordetella pertussis,
Yersinia pestis, Yersinia enterocolitica, Helicobacter pylori and Klebsiella
pneumoniae.

22. A composition containing an effective amount of a mimic of an amino-terminal motif
of a pilus subunit with at least two alternating hydrophobic amino acid residues which 
mimic exhibits antibacterial activity against a Gram-negative bacterium and a 
pharmaceutically acceptable carrier, excipient or diluent. 

23. The composition of claim 22 wherein said mimic exhibits antibacterial activity by
competitively binding to a pilus subunit groove and thereby inhibits pilus assembly. 

24. The composition of claim 23 which is a 10 to 20 residue peptide or peptide analog
according to formula (I):

(I) Z1~Z2-X1-X2-X3-X4-X5-X6-X7-X8-X9-X10-Z3~Z4

or a pharmaceutically-acceptable salt thereof, wherein:
Z1 is R-C(O)-NR- or RRN-;
Z2 is an optional 1 to 5 residue peptide or peptide analog;
X1 is any amino acid residue;
X2 is any amino acid residue;
X3 is a hydrophobic residue or a hydroxyl-substituted aliphatic residue;
X4 is any amino acid residue;
X5 is a hydrophobic residue or Gly;
X6 is a hydrophobic or a hydrophilic residue;
X7 is Gly, an amide-substituted polar residue or a hydrophobic residue;
X8 is any amino acid residue;
X9 is an aliphatic residue;
X10 is any amino acid residue;
Z3 is an optional 1 to 5 residue peptide or peptide analog;
Z4 is -C(O)OR or -C(O)NRR;

each R is independently hydrogen, (C1-C6) alkyl, (C2-C6) alkenyl, (C2-C6)
alkynyl or (C6-C14) aryl;

each "-" between residues X1 through X10, Z2 and X1, and X10 and Z3
independently represents an amide linkage, a substituted amide linkage or an isostere
of an amide linkage; and

each "~" represents a bond.

25. The composition of claim 24 wherein the mimic further comprises one or more
features selected from the group consisting of:

each "-" between residues X1 through X10, Z2 and X1, and X10 and Z3 is an
amide linkage;
Z1 is H2N-;
Z4 is -C(O)OH or a salt thereof;
optional Z2 is not present;
optional Z3 is not present;
X1 is other than a basic residue;
X2 is other than an aliphatic residue;
X3 is an aliphatic residue or T;
X4 is other than an acidic residue;
X5 is an aliphatic residue, F or G;
X7 is G, N or A;
X8 is other than an aliphatic residue; and
X10 is an aliphatic or a polar residue.

26. The composition of claim 25 wherein said compound is selected from the group
consisting of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID
NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO:
11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO:
16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO:
21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO:
26, SEQ ID NO: 27, SEQ ID NO: 28 and SEQ ID NO: 29.

27. A composition containing an effective amount of a mimic of a chaperone G1
beta-strand with at least two alternating hydrophobic amino acid residues which mimic
exhibits antibacterial activity against a Gram-negative bacterium and a
pharmaceutically acceptable carrier, excipient or diluent.

28. The composition of claim 27 wherein said mimic exhibits antibacterial activity by
competitively binding to a pilus subunit groove and thereby inhibits pilus assembly. 

29. The composition of claim 28 which is a 7 to 17 residue peptide or peptide analog 
according to formula (II): 

(II) Z11~Z12-X11-X12-X13-X14-X15-X16-X17-Z13~Z14 

or a pharmaceutically-acceptable salt thereof, wherein: 
Z11 is R'-C(O)-NR'- or R'R'N-; 
Z12 is an optional 1 to 5 residue peptide or peptide analog; 
X11 is any amino acid residue; 
X12 is any amino acid residue; 
X13 is a hydrophobic residue; 
X14 is any amino acid residue; 
X15 is a hydrophobic residue; 
X16 is any amino acid residue; 
X17 is a hydrophobic residue or a hydroxyl-substituted aliphatic residue; 
Z13 is an optional 1 to 5 residue peptide or peptide analog; 
Z14 is -C(O)OR' or -C(O)NR'R'; 

each R' is independently hydrogen, (C1-C6) alkyl, (C2-C6) alkenyl, 
(C2-C6) alkynyl or (C6-C14) aryl; 

each "-" between residues X11 through X17, Z12 and X11 and X17 and Z13 
independently represents an amide linkage, a substituted amide linkage or an isostere 
of an amide linkage; and 

each "~" independently represents a bond. 
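For illustration only, the positional constraints that formula (II) places on residues X11 through X17 can be sketched as a pattern check. The residue classes below (which one-letter codes count as hydrophobic or as hydroxyl-substituted aliphatic) are assumptions made for this sketch and are not enumerated in the claims; the function name is likewise hypothetical, and the sketch ignores the optional Z12 and Z13 extensions and the terminal groups.

```python
import re

# Assumed residue classes (one-letter codes); not taken from the claims.
HYDROPHOBIC = "AVLIMFWY"
HYDROXYL_ALIPHATIC = "ST"

# X11 X12 X13 X14 X15 X16 X17: X13 and X15 hydrophobic, X17 hydrophobic or
# hydroxyl-substituted aliphatic; "." stands for "any amino acid residue".
FORMULA_II_CORE = re.compile(
    rf"..[{HYDROPHOBIC}].[{HYDROPHOBIC}].[{HYDROPHOBIC}{HYDROXYL_ALIPHATIC}]"
)


def matches_formula_ii_core(core: str) -> bool:
    """Return True if a 7-residue core fits the X11..X17 pattern sketched above."""
    if len(core) != 7 or not (core.isalpha() and core.isupper()):
        return False
    return FORMULA_II_CORE.fullmatch(core) is not None
```

Under these assumed classes, the sequence NVLQIAL (SEQ ID NO: 1) satisfies the pattern, with L, I and L at positions X13, X15 and X17.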

30. The composition of claim 29 wherein said mimic further comprises one or more 
features selected from the group consisting of: 

each "-" between residues X11 through X17, Z12 and X11 and X17 and Z13 is an 
amide linkage; 

Z11 is H2N-; 
Z14 is -C(O)OH or a salt thereof; 
optional Z12 is not present; 
optional Z13 is not present; 
X11 is other than a basic residue; 
X13 is an aliphatic residue or M; 
X14 is other than an aromatic residue; 
X15 is an aliphatic residue, F or M; and 
X17 is an aliphatic residue, F, M or a hydroxyl-substituted aliphatic residue. 

31. The composition of claim 30 wherein said mimic is selected from the group consisting 
of SEQ ID NO: 1, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 
33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 
38, SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 
43, SEQ ID NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 
48, SEQ ID NO: 49, SEQ ID NO: 50, SEQ ID NO: 51 and SEQ ID NO: 52. 

32. The composition of claim 31 wherein the mimic comprises the amino acid sequence 
NVLQIAL (SEQ ID NO: 1) or an analogue thereof. 

33. A composition containing an effective amount of a mannose analogue capable of 
competitively binding the amino terminal mannose-binding domain of a Gram- 
negative bacterial adhesin and a pharmaceutically acceptable carrier, excipient or 
diluent. 

34. A method of preventing or inhibiting formation of a pilus subunit-subunit structure in 
a subject, said method comprising administering a mimic of an N-terminal motif of a 
pilus subunit with at least two alternating hydrophobic amino acid residues which 
mimic exhibits antibacterial activity against a Gram-negative bacterium. 

35. The method of claim 34 wherein said mimic exhibits antibacterial activity by 
competitively binding to a pilus subunit groove and thereby inhibits pilus assembly. 

36. The method of claim 35 wherein said mimic comprises the amino acid sequence 
SDVAFRGNLL (SEQ ID NO: 12) or an analogue thereof. 

37. The method of claim 36 wherein said subject is a mammal. 

38. The method of claim 36 wherein said subject is a plant. 

39. A method of preventing or inhibiting formation of a chaperone-subunit structure in a 
subject, said method comprising administering a mimic of a chaperone G1 beta strand 
with at least two alternating hydrophobic amino acid residues which mimic exhibits 
antibacterial activity against a Gram-negative bacterium. 

40. The method of claim 39 wherein said mimic exhibits antibacterial activity by 
competitively binding to a pilus subunit groove and thereby inhibits pilus assembly. 

41. The method of claim 40 wherein said mimic comprises the amino acid sequence 
NVLQIAL (SEQ ID NO: 1) or an analogue thereof. 

42. A method of preventing or inhibiting pili adhesion to a host tissue, said method 
comprising administering a mannose analogue capable of competitively binding the 
amino terminal mannose-binding domain of a Gram-negative bacterial adhesin. 

43. A method of treating a bacterial infection comprising administering to a subject in 
need thereof an effective amount of a compound which is a 10 to 20 residue peptide or 
peptide analog according to formula (I): 

(I) Z1~Z2-X1-X2-X3-X4-X5-X6-X7-X8-X9-X10-Z3~Z4 

or a pharmaceutically-acceptable salt thereof, wherein: 
Z1 is R-C(O)-NR- or RRN-; 
Z2 is an optional 1 to 5 residue peptide or peptide analog; 
X1 is any amino acid residue; 
X2 is any amino acid residue; 
X3 is a hydrophobic residue or a hydroxyl-substituted aliphatic residue; 
X4 is any amino acid residue; 
X5 is a hydrophobic residue or Gly; 
X6 is a hydrophobic or a hydrophilic residue; 
X7 is Gly, an amide-substituted polar residue or a hydrophobic residue; 
X8 is any amino acid residue; 
X9 is an aliphatic residue; 
X10 is any amino acid residue; 
Z3 is an optional 1 to 5 residue peptide or peptide analog; 
Z4 is -C(O)OR or -C(O)NRR; 

each R is independently hydrogen, (C1-C6) alkyl, (C2-C6) alkenyl, (C2-C6) 
alkynyl or (C6-C14) aryl; 

each "-" between residues X1 through X10, Z2 and X1 and X10 and Z3 
independently represents an amide linkage, a substituted amide linkage or an isostere 
of an amide linkage; and 

each "~" represents a bond. 

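As a rough illustration, the per-position constraints of formula (I) on residues X1 through X10 can be written as a lookup table and checked residue by residue. The class memberships below are assumptions for this sketch (the claims name the classes but do not list their members here), X6 is treated as unconstrained on the assumption that every standard residue is either hydrophobic or hydrophilic, and all names are hypothetical. The optional Z2 and Z3 extensions and the terminal groups are not modelled.

```python
# Assumed residue classes (one-letter codes); not taken from the claims.
HYDROPHOBIC = set("AVLIMFWY")
ALIPHATIC = set("AVLI")
HYDROXYL_ALIPHATIC = set("ST")
AMIDE_POLAR = set("NQ")
ANY = None  # position accepts any amino acid residue

# Allowed residues for X1..X10, in order.
FORMULA_I_CORE = [
    ANY,                                # X1
    ANY,                                # X2
    HYDROPHOBIC | HYDROXYL_ALIPHATIC,   # X3
    ANY,                                # X4
    HYDROPHOBIC | {"G"},                # X5
    ANY,                                # X6 (hydrophobic or hydrophilic)
    {"G"} | AMIDE_POLAR | HYDROPHOBIC,  # X7
    ANY,                                # X8
    ALIPHATIC,                          # X9
    ANY,                                # X10
]


def formula_i_violations(core: str) -> list[int]:
    """Return the 1-based positions (X1..X10) whose residue breaks a constraint."""
    if len(core) != len(FORMULA_I_CORE):
        raise ValueError("core must contain exactly 10 residues")
    return [
        index + 1
        for index, (residue, allowed) in enumerate(zip(core, FORMULA_I_CORE))
        if allowed is not None and residue not in allowed
    ]
```

Under these assumed classes, SDVAFRGNLL (SEQ ID NO: 12) produces an empty list, meaning no position violates the pattern.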
44. The method of claim 43 wherein said compound further comprises one or more 
features selected from the group consisting of: 

each "-" between residues X1 through X10, Z2 and X1 and X10 and Z3 is an 
amide linkage; 

Z1 is H2N-; 
Z4 is -C(O)OH or a salt thereof; 
optional Z2 is not present; 
optional Z3 is not present; 
X1 is other than a basic residue; 
X2 is other than an aliphatic residue; 
X3 is an aliphatic residue or T; 
X4 is other than an acidic residue; 
X5 is an aliphatic residue, F or G; 
X7 is G, N or A; 
X8 is other than an aliphatic residue; and 
X10 is an aliphatic or a polar residue. 

45. The method of claim 44 wherein said compound is selected from the group consisting 
of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ 
ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID 
NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID 
NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID 
NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID 
NO: 27, SEQ ID NO: 28 and SEQ ID NO: 29. 

46. The method of claim 45 wherein said mimic comprises the amino acid sequence 
SDVAFRGNLL (SEQ ID NO: 12) or an analogue thereof. 

47. The method of claim 42 wherein the infection is caused by E. coli, H. influenzae, S. 
enteritidis, S. typhimurium, B. pertussis, Y. pestis, Y. enterocolitica, H. pylori and K. 
pneumoniae. 

48. The method of claim 42 wherein the subject is a mammal or human. 

49. The method of claim 42 wherein the subject is a plant. 

50. A method of treating a bacterial infection comprising administering to a subject in 
need thereof an effective amount of a compound which is a 7 to 17 residue peptide or 
peptide analog according to formula (II): 

(II) Z11~Z12-X11-X12-X13-X14-X15-X16-X17-Z13~Z14 

or a pharmaceutically-acceptable salt thereof, wherein: 
Z11 is R'-C(O)-NR'- or R'R'N-; 
Z12 is an optional 1 to 5 residue peptide or peptide analog; 
X11 is any amino acid residue; 
X12 is any amino acid residue; 
X13 is a hydrophobic residue; 
X14 is a hydrophobic or a hydrophilic residue; 
X15 is a hydrophobic residue; 
X16 is any amino acid residue; 
X17 is a hydrophobic residue or a hydroxyl-substituted aliphatic residue; 
Z13 is an optional 1 to 5 residue peptide or peptide analog; 
Z14 is -C(O)OR' or -C(O)NR'R'; 

each R' is independently hydrogen, (C1-C6) alkyl, (C2-C6) alkenyl, (C2-C6) 
alkynyl or (C5-C14) aryl; 

each "-" between residues X11 through X17, Z12 and X11 and X17 and Z13 
independently represents an amide linkage, a substituted amide linkage or an isostere 
of an amide linkage; and 

each "~" independently represents a bond. 

51. The method of claim 50 wherein said compound further comprises one or more 
features selected from the group consisting of: 

each "-" between residues X11 through X17, Z12 and X11 and X17 and Z13 is an 
amide linkage; 

Z11 is H2N-; 
Z14 is -C(O)OH or a salt thereof; 
optional Z13 is not present; 
optional Z14 is not present; 
X11 is other than a basic residue; 
X13 is an aliphatic residue or M; 
X14 is other than an aromatic residue; 
X15 is an aliphatic residue, F or M; and 
X17 is an aliphatic residue, F, M or a hydroxyl-substituted aliphatic residue. 

52. The method of claim 51 wherein said compound is selected from the group consisting 
of SEQ ID NO: 1, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, 
SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, 
SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, 
SEQ ID NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, 
SEQ ID NO: 49, SEQ ID NO: 50, SEQ ID NO: 51 and SEQ ID NO: 52. 

53. The method of claim 50 wherein the bacterial infection is caused by E. coli, H. 
influenzae, S. enteritidis, S. typhimurium, B. pertussis, Y. pestis, Y. enterocolitica, H. 
pylori and K. pneumoniae. 

54. The method of claim 50 wherein the subject is a mammal or a human. 

55. The method of claim 50 wherein the subject is a plant. 

56. A method of preventing or inhibiting biofilm formation, said method comprising 
administering an effective amount of an isolated compound which binds to a pilus 
subunit groove to an environment or surface containing Gram-negative bacteria. 

57. The method of claim 56, wherein said compound further comprises a mimic of a 
chaperone G1 beta strand with at least two alternating hydrophobic amino acid 
residues which mimic exhibits antibacterial activity against a Gram-negative 
bacterium. 

58. The method of claim 57 wherein the mimic comprises the amino acid sequence 
NVLQIAL (SEQ ID NO: 1) or an analogue thereof. 

59. The method of claim 56, wherein said compound further comprises a mimic of an 
amino-terminal motif of a pilus subunit with at least two alternating hydrophobic 
amino acid residues which mimic exhibits antibacterial activity. 

60. The method of claim 56, wherein said compound further comprises a mannose 
analogue capable of competitively binding the amino terminal mannose-binding 
domain of a Gram-negative bacterial adhesin. 

61. A method for inhibiting bacterial colonization by a Gram-negative organism, said 
method comprising administering an effective amount of an isolated compound which 
binds to a pilus subunit groove to an environment or surface containing Gram- 
negative bacteria. 

62. The method of claim 61, wherein said compound further comprises a mimic of a 
chaperone G1 beta strand with at least two alternating hydrophobic amino acid 
residues which mimic exhibits antibacterial activity against a Gram-negative 
bacterium. 

63. The method of claim 61 wherein the mimic comprises the amino acid sequence 
NVLQIAL (SEQ ID NO: 1) or an analogue thereof. 

64. The method of claim 61, wherein said compound further comprises a mimic of an 
amino-terminal motif of a pilus subunit with at least two alternating hydrophobic 
amino acid residues which exhibits antibacterial activity against a Gram-negative 
bacterium. 

65. The method of claim 61, wherein said compound further comprises a mannose 
analogue capable of competitively binding the amino-terminal mannose-binding 
domain of a Gram-negative bacterial adhesin. 

66. A composition comprising a pilus chaperone-subunit co-complex in crystalline form, 
wherein said co-complex comprises an amino acid sequence of a G1 beta-strand of a 
chaperone and an amino acid sequence of an amino-terminal end of a pilus subunit. 

67. The composition of claim 66 wherein said amino acid sequence of the G1 beta-strand 
of the chaperone is derived from an N101 to L107 amino acid region of the G1 beta- 
strand of a chaperone. 

68. The composition of claim 67 wherein the amino acid sequence derived from a G1 
beta-strand of a chaperone is SEQ ID NO: 1. 

69. The composition of claim 67 wherein the amino acid sequence derived from an amino 
acid sequence of an amino-terminal end of a pilus subunit is SEQ ID NO: 12. 

70. The composition of claim 66 wherein the pilus chaperone-subunit co-complex in 
crystalline form is a PapD-PapK chaperone-subunit co-complex. 

71. The composition of claim 71 wherein the crystal has a space group of P2₁2₁2₁ with 
unit cell dimensions of a = 62.1 ± 0.2 angstroms, b = 63.6 ± 0.2 angstroms and c = 
92.7 ± 0.2 angstroms. 
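As an illustrative sketch only, the recited unit cell can be compared against a measured cell, and its volume computed, in a few lines. Because P2₁2₁2₁ is an orthorhombic space group, all cell angles are 90 degrees and the volume is simply a × b × c (about 3.66 × 10^5 cubic angstroms for the nominal dimensions). The function and constant names are hypothetical and not part of the claims.

```python
# Nominal unit cell dimensions recited in the claim, in angstroms.
CLAIMED_CELL = {"a": 62.1, "b": 63.6, "c": 92.7}
TOLERANCE = 0.2  # the +/- range recited for each dimension, in angstroms


def orthorhombic_volume(a: float, b: float, c: float) -> float:
    """Volume of an orthorhombic cell (all angles 90 degrees), in cubic angstroms."""
    return a * b * c


def cell_matches(a: float, b: float, c: float, tol: float = TOLERANCE) -> bool:
    """Return True if every measured dimension lies within tol of the recited value."""
    measured = {"a": a, "b": b, "c": c}
    return all(abs(measured[axis] - CLAIMED_CELL[axis]) <= tol for axis in CLAIMED_CELL)
```

For example, a measured cell of 62.0, 63.7 and 92.8 angstroms falls inside the recited ranges, while a cell with a = 62.5 angstroms does not.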

72. The composition of claim 71, wherein said crystal is of diffraction quality. 

73. The composition of claim 71, wherein said crystal is a native crystal. 

74. The composition of claim 71, wherein said crystal is a heavy-atom derivative crystal. 

75. The composition of claim 71, wherein at least one of PapD or PapK of the PapD- 
PapK chaperone-subunit co-complex is a mutant. 

76. The crystal of claim 75, wherein the mutant is a selenomethionine or selenocysteine 
mutant. 

77. The crystal of claim 75, wherein the mutant is a conservative mutant. 

78. The crystal of claim 75, wherein the mutant is a truncated or extended mutant. 

79. The composition of claim 66, wherein said crystal is produced by a method 
comprising the steps of: 

(a) mixing a volume of a solution comprising the PapD-PapK chaperone- 
subunit co-complex with a volume of a reservoir solution comprising a 
precipitant; and 

(b) incubating the mixture obtained in step (a) over the reservoir solution 
in a closed container, under conditions suitable for crystallization until 
the crystal forms. 

80. A method of crystallizing a PapD-PapK chaperone-subunit co-complex, said method 
comprising: 

(a) mixing a volume of a solution comprising the PapD-PapK chaperone- 
subunit co-complex with a volume of a reservoir solution comprising a 
precipitant; and 

(b) incubating the mixture obtained in step (a) over the reservoir solution 
in a closed container, under conditions suitable for crystallization until 
the crystal forms. 

81. A method of identifying an antibacterial compound, comprising the step of using a 
three-dimensional structural representation of a pilus chaperone-subunit co-complex, 
or a fragment thereof comprising a G1 beta-strand binding cleft, to computationally 
screen a candidate compound for an ability to bind the G1 beta-strand binding cleft of 
the pilus subunit. 

82. The method of claim 81 further comprising the steps of: 
synthesizing the candidate compound; and 
screening the candidate compound for antibacterial activity. 

83. The method of claim 81 wherein the three-dimensional structural information 
comprises the atomic structure coordinates of a PapK subunit. 

84. The method of claim 83 wherein the three-dimensional structural information further 
comprises the atomic structure coordinates of residues comprising the G1 beta-strand 
binding cleft of a PapK subunit. 

85. The method of claim 84 wherein the atomic structure coordinates of residues 
comprising the G1 beta-strand binding cleft of the PapK subunit are obtained from the 
atomic structure coordinates of a PapD-PapK chaperone-subunit co-complex. 

86. The method of claim 85 wherein the PapD-PapK co-complex atomic structure 
coordinates are those coordinates deposited at the Protein Data Bank under entry code 
1PDK. 

87. The method of claim 81 wherein the structural information comprises the atomic 
structure coordinates of a FimH subunit. 

88. The method of claim 87 wherein the structural information further comprises the 
atomic structure coordinates of residues comprising a G1 beta-strand binding cleft of a 
FimH subunit. 

89. The method of claim 88 wherein the atomic structure coordinates are obtained from 
the atomic structure coordinates of a FimC-FimH chaperone-adhesin co-complex. 

90. The method of claim 89 wherein the atomic structure coordinates are those 
coordinates deposited at the Research Collaboratory for Structural Bioinformatics 
Protein Data Bank under entry code 1QUN. 

91. A method of identifying an antibacterial compound comprising the step of using a 
three-dimensional structural representation of a pilus chaperone-subunit co-complex, 
or a fragment thereof comprising a G1 beta-strand binding cleft, to computationally 
design a synthesizable candidate compound that binds the G1 beta-strand binding cleft 
of a pilus subunit. 

92. The method of claim 91 wherein the computational design comprises the steps of: 

identifying chemical entities or fragments capable of associating with the G1 
beta-strand binding cleft of the pilus subunit; and 

assembling the chemical entities or fragments into a single molecule to 
provide the structure of the candidate compound. 

93. The method of claim 92 further including the steps of: 

synthesizing the candidate compound; and 

screening the candidate compound for antibacterial activity. 

94. The method of claim 93 wherein the structural information comprises the atomic 
structure coordinates of a PapK subunit. 

95. The method of claim 94 wherein the structural information further comprises the 
atomic structure coordinates of residues comprising the G1 beta-strand binding cleft of 
a PapK subunit. 

96. The method of claim 95 wherein the atomic structure coordinates are obtained from 
the atomic structure coordinates of a PapD-PapK chaperone-subunit co-complex. 

97. The method of claim 96 wherein the PapD-PapK co-complex atomic structure 
coordinates are those coordinates deposited at the Protein Data Bank under entry code 
1PDK. 

98. The method of claim 93 wherein the structural information comprises the atomic 
structure coordinates of a FimH subunit. 

99. The method of claim 98 wherein the structural information comprises the atomic 
structure coordinates of residues comprising a G1 beta-strand binding cleft of a FimH 
subunit. 

100. The method of claim 99 wherein the atomic structure coordinates are obtained from 
the atomic structure coordinates of a FimC-FimH chaperone-adhesin co-complex. 

101. The method of claim 100 wherein the atomic structure coordinates are those 
coordinates deposited at the Research Collaboratory for Structural Bioinformatics 
Protein Data Bank under entry code 1QUN. 

102. A method of identifying a compound having antibacterial activity, comprising the step 
of using a three-dimensional structural representation of a pilus chaperone, or a 
fragment thereof comprising a G1 beta-strand, to identify or design a compound 
having a three-dimensional structure similar to the three-dimensional structure of the 
G1 beta-strand of the pilus chaperone. 

103. The method of claim 102 wherein the three-dimensional structural information 
comprises the atomic structure coordinates of residues comprising a G1 beta-strand of 
a PapD chaperone or a FimC chaperone. 

104. The method of claim 103 wherein the three-dimensional structural information 
comprises the atomic structure coordinates of a PapD chaperone. 

105. The method of claim 104 wherein the atomic structure coordinates of the PapD 
chaperone are obtained from the atomic structure coordinates of a PapD-PapK 
chaperone-subunit co-complex. 

106. The method of claim 105 wherein the atomic structure coordinates of the PapD-PapK 
chaperone-subunit co-complex are those deposited at the Protein Data Bank under 
entry code 1PDK. 
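As a hedged sketch of what working with such deposited coordinates might look like, the function below pulls the atoms of a residue range out of a legacy fixed-column PDB-format file. The residue range 101 to 107 echoes the N101 to L107 region recited for the G1 beta-strand; the chain identifier, the function name and any file name are assumptions for illustration and are not taken from the deposited entry.

```python
def residue_atoms(pdb_lines, chain_id, first_res, last_res):
    """Collect (residue name, residue number, atom name, x, y, z) tuples for
    ATOM records of one chain whose residue number lies in [first_res, last_res].

    Uses the fixed columns of the legacy PDB format: atom name 13-16,
    residue name 18-20, chain 22, residue number 23-26, x/y/z 31-54.
    """
    atoms = []
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 54:
            continue  # skip non-ATOM records and truncated lines
        if line[21] != chain_id:
            continue
        res_seq = int(line[22:26])
        if first_res <= res_seq <= last_res:
            atoms.append((
                line[17:20].strip(),
                res_seq,
                line[12:16].strip(),
                float(line[30:38]),
                float(line[38:46]),
                float(line[46:54]),
            ))
    return atoms


# Hypothetical usage, assuming a local copy of the entry and that the
# chaperone is stored as chain "A" (an assumption, not verified here):
# with open("1pdk.pdb") as handle:
#     g1_strand = residue_atoms(handle, "A", 101, 107)
```

The extracted coordinates could then be passed to whatever modelling or comparison tool is used; that downstream step is outside the scope of this sketch.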

107. The method of claim 103 wherein the three-dimensional structural information 
comprises the atomic structure coordinates of a FimC chaperone. 

108. The method of claim 107 wherein the atomic structure coordinates of the FimC 
chaperone are obtained from the atomic structure coordinates of a FimC-FimH 
chaperone-adhesin co-complex. 

109. The method of claim 108 wherein the structure coordinates of the FimC-FimH 
chaperone-adhesin co-complex are those deposited at the Research Collaboratory for 
Structural Bioinformatics Protein Data Bank under entry code 1QUN. 

110. A method of identifying an antibacterial compound, said method comprising the step 
of using a three-dimensional structural representation of an adhesin, or a fragment 
thereof comprising a lectin binding domain, or a portion thereof, to screen a candidate 
compound for the ability to bind a lectin binding domain of the adhesin. 

111. The method of claim 110, further comprising the steps of: 

synthesizing the candidate compound; and 

assaying the candidate compound for antibacterial activity. 

112. The method of claim 111, wherein the three-dimensional structural information 
comprises the atomic structure coordinates of a FimH adhesin. 

113. The method of claim 112 wherein the three-dimensional structural information further 
comprises the atomic structure coordinates of residues comprising a lectin binding 
domain of a FimH adhesin or portion thereof. 

114. The method of claim 113 wherein the atomic structure coordinates are obtained from 
the structure coordinates of a FimC-FimH chaperone-adhesin co-complex. 

115. The method of claim 114 wherein the structure coordinates of the FimC-FimH 
chaperone-adhesin co-complex are those deposited at the Research Collaboratory for 
Structural Bioinformatics Protein Data Bank under entry code 1QUN. 

116. A method of identifying an antibacterial compound comprising the step of using a 
three-dimensional structural representation of an adhesin, or a fragment thereof 
comprising a lectin binding domain or portion thereof, to computationally design a 
compound that binds the lectin binding domain of the adhesin. 

117. The method of claim 116 wherein the computational design comprises the steps of: 

identifying chemical entities or fragments capable of associating with the 
lectin binding domain; and 

assembling the chemical entities or fragments into a single molecule to 
provide the structure of the candidate compound. 

118. The method of claim 117, further comprising the steps of: 
synthesizing the candidate compound; and 

screening the candidate compound for antibacterial activity. 

119. The method of claim 118 wherein the three-dimensional structural information 
comprises the atomic structure coordinates of a FimH adhesin. 

120. The method of claim 119 wherein the three-dimensional structural information further 
comprises the atomic structure coordinates of residues comprising a lectin binding 
domain of a FimH adhesin. 

121. The method of claim 120 wherein the atomic structure coordinates are obtained from 
the structure coordinates of a FimC-FimH chaperone-adhesin co-complex or portion 
thereof. 

122. The method of claim 121 wherein the structure coordinates of the FimC-FimH 
chaperone-adhesin co-complex are those deposited at the Research Collaboratory for 
Structural Bioinformatics Protein Data Bank under entry code 1QUN. 

123. A machine-readable medium embedded with information that corresponds to a three- 
dimensional structural representation of a crystalline pilus chaperone-subunit co- 
complex or a fragment or portion thereof. 

124. The machine-readable medium of claim 123 wherein the pilus chaperone-subunit co- 
complex is a PapD-PapK chaperone-subunit co-complex. 

125. The machine-readable medium of claim 124 wherein at least one subunit of the PapD- 
PapK co-complex is a mutant. 

126. The machine-readable medium of claim 125 wherein the mutant is a selenomethionine 
or selenocysteine mutant. 

127. The machine-readable medium of claim 125 wherein the mutant is a conservative 
mutant. 

128. The machine-readable medium of claim 124, in which the information comprises 
atomic structure coordinates, or a subset thereof. 

129. The machine-readable medium of claim 128 wherein the atomic structure coordinates 
are those deposited at the Protein Data Bank under entry code 1PDK, or a subset 
thereof. 

130. The machine-readable medium of claim 123 wherein the pilus chaperone-subunit co- 
complex is a FimC-FimH chaperone-adhesin co-complex. 

131. The machine-readable medium of claim 130 wherein at least one subunit of the FimC- 
FimH chaperone-adhesin co-complex is a mutant. 

132. The machine-readable medium of claim 131 wherein the mutant is a selenomethionine 
or selenocysteine mutant. 

133. The machine-readable medium of claim 131 wherein the mutant is a conservative 
mutant. 

134. The machine-readable medium of claim 130, in which the information comprises 
atomic structure coordinates, or a subset thereof. 



135. The machine-readable medium of claim 134 wherein the atomic structure coordinates 
are those deposited at the Research Collaboratory for Structural Bioinformatics 
Protein Data Bank under entry code 1QUN, or a subset thereof. 

ABSTRACT 

Many Gram-negative pathogens assemble adhesive structures on their surfaces that 
allow them to colonize host tissues and cause disease. Novel compositions for the prevention 
or inhibition of pilus assembly in Gram-negative pathogens are disclosed. Interacting with 
the binding site of pili subunits will negatively affect the chaperone/usher pathway, which is 
one molecular mechanism by which Gram-negative bacteria assemble adhesive pili structures, 
and thus prevent or inhibit pilus assembly. Additionally, novel compounds and compositions 
for interfering with or preventing adhesion of piliated bacteria to host tissues are provided. Such 
compounds and compositions prevent or inhibit pili adhesion to host tissues by interacting 
with the mannose-binding domains on pilus adhesin subunits. Also provided are methods for 
the treatment or prevention of diseases caused by tissue-adhering pilus-forming bacteria by 
interaction with the binding between pilus subunits; the binding between pilus subunits and 
periplasmic chaperones; and the binding of a pilus adhesin to the host epithelial tissue. Also 
provided are pharmaceutical preparations capable of interacting with the binding between 
pilus subunits, between pilus subunits and periplasmic chaperones and between the pilus 
adhesin and the host epithelial tissue. 

The present invention further relates to co-crystals of pilus chaperone-subunit co- 
complexes, detailed three-dimensional structural information illustrating the interaction 
between pilus subunits and/or between a pilus subunit and a chaperone for a pilus chaperone- 
subunit co-complex and methods of utilizing the X-ray crystallographic data from such co- 
crystals to design, identify and screen for compounds that exhibit antibacterial activity. 

The present invention also relates to machine-readable media embedded with the 
three-dimensional atomic structure coordinates of pilus chaperone-subunit co-complexes and 
subsets thereof. 



[Drawing sheets 1/32 to 6/32, including FIG. 1A and FIG. 1C; images not reproduced] 

Attorney's Docket No. WSHU 2 005.1 
DECLARATION AND POWER OF ATTORNEY 



REGULAR OR DESIGN APPLICATION 



As a below named inventor, I hereby declare that: 

My residence, post office address and citizenship are as stated 
below next to my name. 

I believe I am the original, first and sole inventor (if only one 
name is listed below) or an original, first and joint inventor 
(if plural names are listed below) of the subject matter which is 
claimed and for which a patent is sought on the invention 
entitled: 

ANTI-BACTERIAL COMPOUNDS DIRECTED AGAINST PILUS BIOGENESIS, 
ADHESION AND ACTIVITY, CO-CRYSTALS OF PILUS SUBUNITS AND METHODS 
OF USE THEREOF 

the specification of which: 

(check one) 

[X] is attached hereto 

[ ] was filed on as Application Serial No. 

, and was amended on . 

[ ] was described and claimed in PCT International Application 

No. , filed on and as amended 

under PCT Article 19 on , if any. 



ACKNOWLEDGEMENT OF REVIEW OF PAPERS AND DUTY OF CANDOR 

I hereby state that I have reviewed and understand the contents 
of the above identified specification, including the claims, as 
amended by any amendment referred to above. 

I acknowledge the duty to disclose information which is material 
to patentability as defined in Title 37, Code of Federal 
Regulations §1.56. 

PRIORITY CLAIM 



I hereby claim foreign priority benefits under Title 35, United 
States Code, §119(a)-(d) or §365(b) of any foreign application 
for patent or inventor's certificate, or §365(a) of any PCT 
application which designates at least one country other than the 
United States of America, listed below and have also identified 
below any foreign application for patent or inventor's 
certificate having a filing date before that of the application 
on which priority is claimed: 

Priority Claimed 



(Number) (Country) (Day/Month/Year Filed) 



(Number) (Country) (Day/Month/Year Filed) 



(Number) (Country) (Day/Month/Year Filed) 

Priority Not Claimed 

ANY FOREIGN APPLICATION(S) ON THE SAME SUBJECT MATTER WHICH HAS 
A FILING DATE EARLIER THAN THE EARLIEST APPLICATION FROM WHICH 
PRIORITY IS CLAIMED 



(Number) (Country) (Day/Month/Year Filed) 



CLAIM FOR BENEFIT OF PROVISIONAL APPLICATION(S) 

I hereby claim the benefit under Title 35, United States Code, 
§119(e) of any United States provisional application(s) listed 
below. 

60/148,280 08/11/99 

(Application Number) (Filing Date) 



(Application Number) (Filing Date) 

CLAIM FOR BENEFIT OF EARLIER U.S. APPLICATION(S) 
UNDER 35 U.S.C. 120 

(complete this part only if this is a divisional, 
continuation or CIP application) 

I hereby claim the benefit under Title 35, United States Code, 
§120 of any United States application(s), or §365(c) of any PCT 
international application designating the United States of 
America, listed below and, insofar as the subject matter of each 
of the claims of this application is not disclosed in the prior 
United States application in the manner provided by the first 
paragraph of Title 35, United States Code §112, I acknowledge the 
duty to disclose information which is material to patentability 
as defined in Title 37, Code of Federal Regulations, §1.56 which 
became available between the filing date of the prior application 
and the national or PCT International filing date of this 
application: 



(Serial No.) (Filing Date) (Status) 



(Serial No.) (Filing Date) (Status) 



POWER OF ATTORNEY 

I hereby appoint the following attorneys to prosecute this 
application and to transact all business in the Patent and 
Trademark Office connected therewith: Irving Powers (15,700), 
Donald G. Leavitt (17,626), John K. Roedel, Jr. (25,914), Michael 
E. Godar (28,416), Edward J. Hejlek (31,525), William E. Lahey 
(26,757), Richard G. Heywood (18,224), Frank R. Agovino (27,416), 
Kurt F. James (33,716), G. Harley Blosser (33,650), Paul I. J. 
Fleischut (35,513), Vincent M. Keil (36,838), Robert M. Evans, 
Jr. (36,794), Robert M. Bain (36,736), Joseph A. Schaper 
(30,493), Kathleen M. Petrillo (35,076), David E. Crawford, Jr. 
(38,118), Paul A. Maddock (37,877), Richard L. Bridge (40,529), 
Christopher M. Goff (41,785), James E. Butler (40,931), Derick E. 
Allen (43,468), Matthew L. Cutler (43,574), Michael G. Munsell 
(43,820), Karen Y. Hui (44,785), Anthony R. Kinney (44,834), 
Brian P. Klein (44,837), Sarah J. Chickos (46,157), Donald W. 
Tuegel (45,424), Steven M. Ritchey (46,321), Michael J. Thomas 
(39,857), and Kathryn J. Doty (40,593), all of the law firm of 
SENNIGER, POWERS, LEAVITT & ROEDEL, One Metropolitan Square, 16th 
Floor, St. Louis, Missouri 63102. 
Send Correspondence To: Customer Number: 000321 

Direct Telephone Calls To: Karen Y. Hui, (314) 231-5400 



I hereby declare that all statements made herein of my own 
knowledge are true and that all statements made on information 
and belief are believed to be true; and further that these 
statements were made with the knowledge that willful false 
statements and the like so made are punishable by fine or 
imprisonment, or both, under Section 1001 of Title 18 of the 
United States Code and that such willful false statements may 
jeopardize the validity of the application or any patent issued 
thereon. 



Full name of sole or first inventor Scott J. Hultgren 



Inventor's signature Date 

Residence St. Louis, Missouri Citizenship 

Post Office address 1637 Country Hill Lane 



St. Louis, Missouri 63021 



Full name of second joint inventor Frederic G. Sauer 

Second inventor's signature Date 

Residence St. Louis, Missouri Citizenship U.S. 

Post Office address 7510 Parkdale, #1E 

St. Louis, Missouri 63105 
Full name of third joint inventor Gabriel Waksman 

Third inventor's signature Date 

Residence St. Louis, Missouri Citizenship France 

Post Office address 500 West Drive 

University City, MO 63130 

Full name of fourth joint inventor Klaus Fuetterer 

Fourth inventor's signature Date 

Residence St. Louis, Missouri Citizenship Germany 

Post Office address 4540 Laclede Avenue, #201 

St. Louis, Missouri 63108 
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

Application of Hultgren, et al. 
Serial No.: (to be assigned) 
Filed: August 11, 2000 

For: ANTI-BACTERIAL COMPOUNDS DIRECTED AGAINST PILUS BIOGENESIS, 
ADHESION AND ACTIVITY, CO-CRYSTALS OF PILUS SUBUNITS AND METHODS OF 
USE THEREOF 
Our File: WSHU 2005.1 

August 11, 2000 



STATEMENT UNDER 37 C.F.R. 1.821(f) 

TO THE COMMISSIONER OF PATENTS AND TRADEMARKS 
Sir: 

In accordance with 37 C.F.R. 1.821(f), I hereby state that the information recorded in 
computer readable form is identical to the written sequence listing submitted in support of the 
present application. 

Respectfully submitted, 

Kelley S. B^sjunas, Paralegal 

SENNIGER, POWERS, LEAVITT & ROEDEL 

One Metropolitan Square, 16th Floor 

St. Louis, MO 63102 

(314) 231-5400 



CERTIFICATE OF MAILING 

I certify that the foregoing Statement under 37 C.F.R. 1.821(f) is being deposited with the 
United States Postal Service as Express Mail #EL493155860US, in an envelope addressed to: 
Assistant Commissioner for Patents, Box Patent Application, Washington, D.C. 20231 on this 
11th day of August, 2000. 

Branay Melton 



SEQUENCE LISTING 



<110> WASHINGTON UNIVERSITY 

<120> ANTI-BACTERIAL COMPOUNDS DIRECTED AGAINST PILUS 
BIOGENESIS, ADHESION AND ACTIVITY; CO-CRYSTALS OF PILUS 
SUBUNITS AND METHODS OF USE THEREOF 

<130> WSHU2005.1 

<140> 

<141> 

<150> US 60/148,280 
<151> 1999-08-11 

<160> 56 

<170> PatentIn Ver. 2.1 

<210> 1 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 1 

Asn Val Leu Gln Ile Ala Leu 

1 5 



<210> 2 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 2 

Gly Lys Val Thr Phe Asn Gly Thr Val Val 
1 5 10 






<210> 3 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 3 

Gly Thr Val His Phe Lys Gly Glu Val Val 
1 5 10 



<210> 4 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 4 

Gly Lys Val Thr Phe Phe Gly Lys Val Val 
1 5 10 



<210> 5 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 5 

Gly Thr Ile Val Ile Thr Gly Thr Ile Thr 
1 5 10 



<210> 6 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 6 

Gly Thr Ile Val Ile Thr Gly Ser Ile Ser 
1 5 10 



<210> 7 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 7 

Gly Thr Val Lys Phe Val Gly Ser Ile Ile 
1 5 10 



<210> 8 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 8 

Gly Glu Ile Gln Leu Lys Gly Glu Ile Val 
1 5 10 



<210> 9 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 9 

Gly Thr Ile Lys Phe Thr Gly Glu Ile Val 
1 5 10 






<210> 10 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 10 

Asn Glu Val Thr Phe Leu Gly Ser Val Ser 
1 5 10 



<210> 11 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 11 

Gly Thr Ile Asn Phe Glu Gly Ser Val Val 
1 5 10 



<210> 12 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 12 

Ser Asp Val Ala Phe Arg Gly Asn Leu Leu 
1 5 10 



<210> 13 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 13 

Gly Arg Ala Ala Phe His Gly Glu Val Val 
1 5 10 



<210> 14 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 14 

Gly Arg Ala Thr Phe His Gly Glu Val Val 
1 5 10 



<210> 15 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 15 

Asp Asn Leu Thr Phe Arg Gly Lys Leu Ile 
1 5 10 



<210> 16 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 16 

Asp Asn Leu Thr Phe Lys Gly Lys Leu Ile 
1 5 10 



<210> 17 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 17 

Gly Trp Leu Asn Leu Gln Gly Thr Ile Leu 
1 5 10 



<210> 18 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 18 

Ser Val Val Asn Ile Thr Gly Asn Val Gln 
1 5 10 



<210> 19 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 19 

Thr Thr Ile Thr Val Thr Gly Asn Val Leu 
1 5 10 



<210> 20 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 20 

Thr Thr Ile Thr Val Thr Gly Arg Val Leu 
1 5 10 



<210> 21 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 21 

Cys Met Leu Ala Gly Ser Asn Phe Val Thr 
1 5 10 



<210> 22 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 22 

Val Gln Ile Asn Ile Arg Gly Asn Val Tyr 
1 5 10 



<210> 23 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 23 

Pro Asn Leu Lys Leu Phe Gly Thr Leu Leu 
1 5 10 



<210> 24 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 24 

Val Tyr Ile Asn Ile Thr Gly Asn Val Ile 
1 5 10 



<210> 25 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 25 

Gly Lys Ile Thr Phe Asn Gly Lys Val Val 
1 5 10 



<210> 26 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 



<400> 26 

Gly Thr Ile Asn Phe Asn Gly Lys Ile Thr 
1 5 10 



<210> 27 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 27 

Gln Lys Thr Ile Phe Ser Ala Asp Val Val 
1 5 10 



<210> 28 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 28 

Gly Gln Val Asn Phe Phe Gly Lys Val Thr 
1 5 10 



<210> 29 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 29 

Gln Arg Thr Ile Ile Thr Ala Asp Val Val 
1 5 10 



<210> 30 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: Synthesized 
Sequence 



<400> 30 

Gly Ser Leu Ser Leu Ala Ile 
1 5 



<210> 31 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 31 

Asn Tyr Leu Gln Phe Ala Ile 
1 5 



<210> 32 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 32 

Ser Gly Ile Ala Val Ala Leu 
1 5 



<210> 33 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 33 

Asn Ile Leu Gln Leu Ala Ile 
1 5 

<210> 34 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 34 

Ser Phe Met Gln Ile Ala Ile 
1 5 



<210> 35 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 35 

Asn Tyr Leu Gln Phe Ala Val 
1 5 



<210> 36 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 36 

Asn Thr Leu Gln Leu Ala Ile 
1 5 



<210> 37 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 37 

Gly Val Leu Gln Leu Thr Ile 
1 5 



<210> 38 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 38 

Asn Val Leu Ala Val Ala Val 
1 5 



<210> 39 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 39 

Ser Leu Leu Gln Leu Ala Phe 
1 5 



<210> 40 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 40 

Ser Gly Ile Ala Val Ala Val 
1 5 



<210> 41 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 41 

Asn Ala Leu Lys Phe Ala Met 
1 5 



<210> 42 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 42 

Asn Val Leu Gln Met Ala Met 
1 5 



<210> 43 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 43 

Asn Tyr Leu Gln Phe Ala Ile 
1 5 



<210> 44 
<211> 7 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 44 

Asn Val Leu Gln Ile Ala Val 
1 5 



<210> 45 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 45 

Leu Asn Val Asn Val Val Thr 
1 5 



<210> 46 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 46 

Val Phe Val Gln Phe Ala Ile 
1 5 



<210> 47 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 47 

Met Lys Leu Asn Val Ser Ile 
1 5 



<210> 48 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 48 

Met Asp Ile Gln Met Ser Ile 
1 5 



<210> 49 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 49 

Leu Asn Ile Leu Leu Ser Val 
1 5 



<210> 50 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 50 

Met Asn Ile Gln Val Ser Val 
1 5 






<210> 51 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 51 

Asp Ser Ile Asn Ile Ser Ile 
1 5 



<210> 52 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Sequence 

<400> 52 

Leu Asn Val Gln Leu Ser Val 
1 5 



<210> 53 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 53 

catcgctggc acaggaagga gc 

<210> 54 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 54 

gttggtatga cccgcatcaa tcgc 



<210> 55 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Proteins 

<400> 55 

Asn Thr Leu Gln Leu Ala Ile Ile Ser Arg 
1 5 10 



<210> 56 
<211> 9 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthesized 
Proteins 

<400> 56 

Asp Val Thr Ile Thr Val Asn Gly Lys 
1 5 